PREFACE

This book contains lectures on matrices given at Princeton University at 
various times since 1920. It was my intention to include full notes on the his- 
tory of the subject, but this has proved impossible owing to circumstances 
beyond my control, and I have had to content myself with very brief notes (see 
Appendix I). A bibliography is given in Appendix II. In compiling it, espe- 
cially for the period of the last twenty-five years, there was considerable difficulty 
in deciding whether to include certain papers which, if they had occurred earlier, 
would probably have found a place there. In the main, I have not included 
articles which do not use matrices as an algebraic calculus, or whose interest lies 
in some other part of mathematics rather than in the theory of matrices; but 
consistency in this has probably not been attained. 

Since these lectures have been prepared over a somewhat lengthy period of
time, they owe much to the criticism of many friends. In particular, Professor
A. A. Albert and Dr. J. L. Dorroh read most of the MS, making many
suggestions, and the former gave material help in the preparation of the later
sections of Chapter X.

J. H. M. Wedderburn.

Princeton, N. J., 

July 20, 1934.
CHAPTER I

Matrices and Vectors 

1.01 Linear transformations and vectors. In a set of linear equations

$$
\begin{aligned}
\eta'_1 &= a_{11}\eta_1 + a_{12}\eta_2 + \cdots + a_{1n}\eta_n \\
\eta'_2 &= a_{21}\eta_1 + a_{22}\eta_2 + \cdots + a_{2n}\eta_n \\
&\;\;\vdots \\
\eta'_n &= a_{n1}\eta_1 + a_{n2}\eta_2 + \cdots + a_{nn}\eta_n
\end{aligned}
$$

or

$$\eta'_i = \sum_{j=1}^{n} a_{ij}\eta_j \qquad (i = 1, 2, \cdots, n) \tag{1}$$

the quantities $\eta_1, \eta_2, \cdots, \eta_n$ may be regarded as the coordinates of a point $P$ in
$n$-space and the point $P'(\eta'_1, \eta'_2, \cdots, \eta'_n)$ is then said to be derived from $P$ by
the linear homogeneous transformation (1). Or, in place of regarding the $\eta$'s as
the coordinates of a point we may look on them as the components of a vector $y$
and consider (1) as defining an operation which transforms $y$ into a new vector
$y'$. We shall be concerned here with the properties of such transformations,
sometimes considered abstractly as entities in themselves, and sometimes in
conjunction with vectors.

To prevent misconceptions as to their meaning we shall now define a few terms
which are probably already familiar to the reader. By a scalar or number we
mean an element of the field in which all coefficients of transformations and
vectors are supposed to lie; unless otherwise stated the reader may assume
that a scalar is an ordinary number, real or complex.

A vector$^1$ of order $n$ is defined as a set of $n$ scalars $(\xi_1, \xi_2, \cdots, \xi_n)$ given in a
definite order. This set, regarded as a single entity, is denoted by a single
symbol, say $x$, and we write

$$x = (\xi_1, \xi_2, \cdots, \xi_n).$$

The scalars $\xi_1, \xi_2, \cdots, \xi_n$ are called the coordinates or components of the vector.

If $y = (\eta_1, \eta_2, \cdots, \eta_n)$ is also a vector, we say that $x = y$ if, and only if,
corresponding coordinates are equal, that is, $\xi_i = \eta_i$ $(i = 1, 2, \cdots, n)$. The vector

$$z = (\zeta_1, \zeta_2, \cdots, \zeta_n) = (\xi_1 + \eta_1, \xi_2 + \eta_2, \cdots, \xi_n + \eta_n)$$

is called the sum of $x$ and $y$ and is written $x + y$; it is easily seen that the
operation of addition so defined is commutative and associative, and it has a unique
inverse if we agree to write 0 for the vector $(0, 0, \cdots, 0)$.

1 In chapter 5 we shall find it convenient to use the name hypernumber for the term 
vector which is then used in a more restricted sense, which, however, does not conflict 
with the use made of it here. 
If $\rho$ is a scalar, we shall write

$$\rho x = x\rho = (\rho\xi_1, \rho\xi_2, \cdots, \rho\xi_n).$$

This is the only kind of multiplication we shall use regularly in connection with 
vectors. 

1.02 Linear dependence. In this section we shall express in terms of
vectors the familiar notions of linear dependence.$^2$ If $x_1, x_2, \cdots, x_r$ are
vectors and $\omega_1, \omega_2, \cdots, \omega_r$ scalars, any vector of the form

$$x = \omega_1 x_1 + \omega_2 x_2 + \cdots + \omega_r x_r \tag{2}$$

is said to be linearly dependent on $x_1, x_2, \cdots, x_r$; and these vectors are called
linearly independent if an equation which is reducible to the form

$$0 = \omega_1 x_1 + \omega_2 x_2 + \cdots + \omega_r x_r$$

can only be true when each $\omega_i = 0$. Geometrically the $r$ vectors determine an
$r$-dimensional subspace of the original $n$-space and, if $x_1, x_2, \cdots, x_r$ are taken as
the coordinate axes, $\omega_1, \omega_2, \cdots, \omega_r$ in (2) are the coordinates of $x$.

We shall call the totality of vectors $x$ of the form (2) the linear set or subspace
$(x_1, x_2, \cdots, x_r)$ and, when $x_1, x_2, \cdots, x_r$ are linearly independent, they are
said to form a basis of the set. The number of elements in a basis of a set is
called the order of the set.

Suppose now that $(x_1, x_2, \cdots, x_r)$, $(y_1, y_2, \cdots, y_s)$ are bases of the same
linear set and assume $s \geq r$. Since the $x$'s form a basis, each $y$ can be expressed
in the form

$$y_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{ir}x_r \qquad (i = 1, 2, \cdots, s) \tag{3}$$

and, since the $y$'s form a basis, we may set

$$x_i = b_{i1}y_1 + b_{i2}y_2 + \cdots + b_{is}y_s \qquad (i = 1, 2, \cdots, r)$$

and therefore from (3)

$$y_i = \sum_{j=1}^{r} a_{ij}x_j = \sum_{j=1}^{r} a_{ij} \sum_{k=1}^{s} b_{jk}y_k = \sum_{k=1}^{s} c_{ik}y_k \tag{4}$$

where $c_{ik} = \sum_{j=1}^{r} a_{ij}b_{jk}$, which may also be written

$$c_{ik} = \sum_{j=1}^{s} a_{ij}b_{jk} \qquad (i, k = 1, 2, \cdots, s) \tag{5}$$

if we agree to set $a_{ij} = 0$ when $j > r$. Since the $y$'s are linearly independent,
(4) can only hold true if $c_{ii} = 1$, $c_{ik} = 0$ $(i \neq k)$ so that the determinant
$|c_{ik}| = 1$. But from the rule for forming the product of two determinants it
follows from (5) that $|c_{ik}| = |a_{ik}|\,|b_{ik}|$ which implies (i) that $|a_{ik}| \neq 0$ and
(ii) that $r = s$, since otherwise $|a_{ik}|$ contains the column $a_{i,r+1}$ each element
of which is 0. The order of a set is therefore independent of the basis chosen
to represent it.

2 See for instance Bôcher, Introduction to Higher Algebra, p. 34.

It follows readily from the theory of linear equations (or from §1.11 below)
that, if $|a_{ij}| \neq 0$ in (3), then these equations can be solved for the $x$'s in terms
of the $y$'s, so that the conditions established above are sufficient as well as
necessary in order that the $y$'s shall form a basis.

If $e_i$ denotes the vector whose $i$th coordinate is 1 and whose other coordinates
are 0, we see immediately that we may write

$$x = \xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n$$

in place of $x = (\xi_1, \xi_2, \cdots, \xi_n)$. Hence $e_1, e_2, \cdots, e_n$ form a basis of our
$n$-space. We shall call this the fundamental basis and the individual vectors $e_i$
the fundamental unit vectors.

If $x_1, x_2, \cdots, x_r$ $(r < n)$ is a basis of a subspace of order $r$, we can always
find $n - r$ vectors $x_{r+1}, \cdots, x_n$ such that $x_1, x_2, \cdots, x_n$ is a basis of the
fundamental space. For, if $x_{r+1}$ is any vector not lying in $(x_1, x_2, \cdots, x_r)$,
there cannot be any relation

$$\omega_1 x_1 + \omega_2 x_2 + \cdots + \omega_r x_r + \omega_{r+1} x_{r+1} = 0$$

in which $\omega_{r+1} \neq 0$ (in fact every $\omega$ must be 0) and hence the order of
$(x_1, x_2, \cdots, x_r, x_{r+1})$ is $r + 1$. Since the order of $(e_1, e_2, \cdots, e_n)$ is $n$, a repetition
of this process leads to a basis $x_1, x_2, \cdots, x_r, \cdots, x_n$ of order $n$ after a finite
number of steps; a suitably chosen $e_i$ may be taken for $x_{r+1}$. The $(n-r)$-space
$(x_{r+1}, \cdots, x_n)$ is said to be complementary to $(x_1, x_2, \cdots, x_r)$; it is of course
not unique.

1.03 Linear vector functions and matrices. The set of linear equations
given in §1.01, namely,

$$\eta'_i = \sum_{j=1}^{n} a_{ij}\eta_j \qquad (i = 1, 2, \cdots, n) \tag{6}$$

define the vector $y' = (\eta'_1, \eta'_2, \cdots, \eta'_n)$ as a linear homogeneous function of
the coordinates of $y = (\eta_1, \eta_2, \cdots, \eta_n)$ and in accordance with the usual
functional notation it is natural to write $y' = A(y)$; it is usual to omit the brackets
and we therefore set in place of (6)

$$y' = Ay.$$

The function or operator $A$ when regarded as a single entity is called a
matrix; it is completely determined, relatively to the fundamental basis, when
the $n^2$ numbers $a_{ij}$ are known, in much the same way as the vector $y$ is
determined by its coordinates. We call the $a_{ij}$ the coordinates of $A$ and write

$$A = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{Vmatrix}$$

or, when convenient, $A = \|a_{ij}\|$. It should be noted that in $a_{ij}$ the first suffix
denotes the row in which the coordinate occurs while the second gives the
column.

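If $B = \|b_{ij}\|$ is a second matrix, $y'' = A(By)$ is a vector which is a linear
homogeneous vector function of $y$, and from (6) we have

$$\eta''_i = \sum_{p=1}^{n} a_{ip} \sum_{j=1}^{n} b_{pj}\eta_j = \sum_{j=1}^{n} d_{ij}\eta_j,$$

where

$$d_{ij} = \sum_{p=1}^{n} a_{ip}b_{pj}. \tag{8}$$

The matrix $D = \|d_{ij}\|$ is called the product of $A$ into $B$ and is written $AB$.
The form of (8) should be carefully noted; in it each element of the $i$th row of $A$
is multiplied into the corresponding element of the $j$th column of $B$ and the
terms so formed are added. Since the rows and columns are not interchangeable,
$AB$ is in general different from $BA$; for instance

$$\begin{Vmatrix} 1 & 0 \\ 2 & 1 \end{Vmatrix} \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} = \begin{Vmatrix} a & b \\ 2a + c & 2b + d \end{Vmatrix}, \qquad \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} \begin{Vmatrix} 1 & 0 \\ 2 & 1 \end{Vmatrix} = \begin{Vmatrix} a + 2b & b \\ c + 2d & d \end{Vmatrix}.$$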
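The rule (8) and the failure of commutativity can be checked numerically. The following is a minimal Python sketch and not part of the original text; the helper name `matmul` and the sample values chosen for $a, b, c, d$ are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices (lists of rows) by rule (8):
    d_ij is the sum over p of a_ip * b_pj."""
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

L = [[1, 0], [2, 1]]
M = [[3, 4], [5, 6]]      # sample values a=3, b=4, c=5, d=6

LM = matmul(L, M)         # rows: (a, b), (2a + c, 2b + d)
ML = matmul(M, L)         # rows: (a + 2b, b), (c + 2d, d)
print(LM, ML, LM != ML)
```

With these sample values the two products differ, in agreement with the displayed formulas.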


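The product defined by (8) is associative; for if $C = \|c_{ij}\|$, the element in
the $i$th row and $j$th column of $(AB)C$ is

$$\sum_{q} \Big( \sum_{p} a_{ip}b_{pq} \Big) c_{qj} = \sum_{p} a_{ip} \Big( \sum_{q} b_{pq}c_{qj} \Big)$$

and the term on the right is the $(i, j)$ coordinate of $A(BC)$.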

If we add the vectors $Ay$ and $By$, we get a vector whose $i$th coordinate is
(cf. (6))

$$\sum_{j} a_{ij}\eta_j + \sum_{j} b_{ij}\eta_j = \sum_{j} c_{ij}\eta_j$$

where $c_{ij} = a_{ij} + b_{ij}$. Hence $Ay + By$ may be written $Cy$ where $C = \|c_{ij}\|$.
We define $C$ to be the sum of $A$ and $B$ and write $C = A + B$; two matrices
are then added by adding corresponding coordinates just as in the case of
vectors. It follows immediately from the definition of sum and product that

$$A + B = B + A, \qquad (A + B) + C = A + (B + C),$$

$$A(B + C) = AB + AC, \qquad (B + C)A = BA + CA,$$

$$A(x + y) = Ax + Ay,$$

$A$, $B$, $C$ being any matrices and $x$, $y$ vectors. Also, if $k$ is a scalar and we set
$y' = Ay$, $y'' = ky'$, then

$$y'' = ky' = kA(y) = A(ky)$$

or in terms of the coordinates

$$\eta''_i = \sum_{j} k a_{ij}\eta_j.$$

Hence $kA$ may be interpreted as the matrix derived from $A$ by multiplying
each coordinate of $A$ by $k$.

On the analogy of the unit vectors $e_i$ we now define the fundamental unit
matrices $e_{ij}$ $(i, j = 1, 2, \cdots, n)$. Here $e_{ij}$ is the matrix whose coordinates are
all 0 except the one in the $i$th row and $j$th column whose value is 1.
Corresponding to the form $\sum \xi_i e_i$ for a vector we then have

$$A = \sum_{i,j=1}^{n} a_{ij}e_{ij}. \tag{9}$$

Also from the definition of multiplication in (8)

$$e_{ij}e_{jk} = e_{ik}, \qquad e_{ij}e_{pq} = 0 \quad (j \neq p), \tag{10}$$

a set of relations which might have been made the basis of the definition of the
product of two matrices. It should be noted that it follows from the definition
of $e_{ij}$ that

$$e_{ij}e_j = e_i, \qquad e_{ij}e_k = 0 \quad (j \neq k), \tag{11}$$

$$Ae_k = \sum_{i,j} a_{ij}e_{ij}e_k = \sum_{i} a_{ik}e_i. \tag{12}$$

Hence the coordinates of $Ae_k$ are the coordinates of $A$ that lie in the $k$th column.
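The multiplication table of the fundamental unit matrices can be verified by brute force for a small order. The sketch below is illustrative only and not part of the original text; `unit_matrix` and `matmul` are hypothetical helper names, and the order 3 is an arbitrary choice.

```python
def matmul(a, b):
    """Matrix product by the row-into-column rule."""
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

def unit_matrix(i, j, n):
    """e_ij: the n-by-n matrix with 1 in row i, column j (1-based), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]

n = 3
zero = [[0] * n for _ in range(n)]
idx = range(1, n + 1)

# e_ij e_pk equals e_ik when j == p and vanishes otherwise.
relations_hold = all(
    matmul(unit_matrix(i, j, n), unit_matrix(p, k, n))
    == (unit_matrix(i, k, n) if j == p else zero)
    for i in idx for j in idx for p in idx for k in idx
)
print(relations_hold)
```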


1.04 Scalar matrices. If $k$ is a scalar, the matrix $K$ defined by $Ky = ky$
is called a scalar matrix; from (1) it follows that, if $K = \|k_{ij}\|$, then $k_{ii} = k$
$(i = 1, 2, \cdots, n)$, $k_{ij} = 0$ $(i \neq j)$. The scalar matrix for which $k = 1$ is called
the identity matrix of order $n$; it is commonly denoted by $I$ but, for reasons
explained below, we shall here usually denote it by 1, or by $1_n$ if it is desired
to indicate the order. When written at length we have

$$1_n = \begin{Vmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{Vmatrix}, \qquad K = \begin{Vmatrix} k & & & \\ & k & & \\ & & \ddots & \\ & & & k \end{Vmatrix}.$$

A convenient notation for the coordinates of the identity matrix was
introduced by Kronecker: if $\delta_{ij}$ is the numerical function of the integers $i$, $j$
defined by

$$\delta_{ii} = 1, \qquad \delta_{ij} = 0 \quad (i \neq j), \tag{13}$$

then $1_n = \|\delta_{ij}\|$. We shall use this Kronecker delta function in future
without further comment.


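Theorem 1. Every matrix is commutative with a scalar matrix.

Let $k$ be the scalar and $K = \|k_{ij}\| = \|k\delta_{ij}\|$ the corresponding matrix.
If $A = \|a_{ij}\|$ is any matrix, then from the definition of multiplication

$$KA = \Big\| \sum_{p} k\delta_{ip}a_{pj} \Big\| = \|ka_{ij}\|, \qquad AK = \Big\| \sum_{p} a_{ip}k\delta_{pj} \Big\| = \|ka_{ij}\|,$$

so that $AK = KA$.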

If $k$ and $h$ are two scalars and $K$, $H$ the corresponding scalar matrices, then
$K + H$ and $KH$ are the scalar matrices corresponding to $k + h$ and $kh$. Hence
the one-to-one correspondence between scalars and scalar matrices is
maintained under the operations of addition and multiplication, that is, the two
sets are simply isomorphic with respect to these operations. So long therefore
as we are concerned only with matrices of given order, there is no confusion
introduced if we replace each scalar by its corresponding scalar matrix, just
as in the theory of ordinary complex numbers, $(a, b) = a + bi$, the set of
numbers of the form $(a, 0)$ is identified with the real continuum. We shall
therefore as a rule denote $\|\delta_{ij}\|$ by 1 and $\|k\delta_{ij}\|$ by $k$.


1.05 Powers of a matrix; adjoint matrices. Positive integral powers of
$A = \|a_{ij}\|$ are readily defined by induction; thus

$$A^2 = A \cdot A, \quad A^3 = A \cdot A^2, \quad \cdots, \quad A^m = A \cdot A^{m-1}.$$

With this definition it is clear that $A^rA^s = A^{r+s}$ for any positive integers $r$, $s$.
Negative powers, however, require more careful consideration.

Let the determinant formed from the array of coefficients of a matrix be
denoted by

$$|A| = \det A$$

and let $\alpha_{qp}$ be the cofactor of $a_{pq}$ in $A$, so that from the properties of
determinants

$$\sum_{p} a_{ip}\alpha_{pj} = |A|\,\delta_{ij} = \sum_{p} \alpha_{ip}a_{pj} \qquad (i, j = 1, 2, \cdots, n). \tag{14}$$

The matrix $\|\alpha_{ij}\|$ is called the adjoint of $A$ and is denoted by adj $A$. In this
notation (14) may be written

$$A\,(\text{adj}\,A) = |A| = (\text{adj}\,A)\,A, \tag{15}$$

so that a matrix and its adjoint are commutative.

If $|A| \neq 0$, we define $A^{-1}$ by

$$A^{-1} = |A|^{-1}\,\text{adj}\,A. \tag{16}$$

Negative integral powers are then defined by $A^{-r} = (A^{-1})^r$; evidently
$A^{-r} = (A^r)^{-1}$. We also set $A^0 = 1$, but it will appear later that a different
interpretation must be given when $|A| = 0$. Since
$AB \cdot B^{-1}A^{-1} = A \cdot BB^{-1} \cdot A^{-1} = AA^{-1} = 1$, the reciprocal of the product $AB$ is

$$(AB)^{-1} = B^{-1}A^{-1}.$$
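Relations (15) and (16) lend themselves to a direct check in exact arithmetic. The following Python sketch is an illustration under stated assumptions, not the author's procedure: the helper names (`det`, `adj`, `inverse`, `matmul`) and the sample matrices are hypothetical, and the cofactor expansion used is only practical for small orders.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

def minor(a, i, j):
    """Array left after deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row; the empty array has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adj(a):
    """Adjoint: the (i, j) entry is the cofactor of a_ji."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]

def inverse(a):
    """A^{-1} = |A|^{-1} adj A, defined only when |A| != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adj(a)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 2, 0], [0, 1, 0], [1, 0, 1]]
print(det(A), matmul(A, adj(A)))
```

Using `Fraction` keeps every entry exact, so the identities can be compared with `==` rather than within a tolerance.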

If $A$ and $B$ are matrices, the rule for multiplying determinants, when stated
in our notation, becomes

$$|AB| = |A|\,|B|.$$

In particular, if $AB = 1$, then $|A|\,|B| = 1$; hence, if $|A| = 0$, there is no
matrix $B$ such that $AB = 1$ or $BA = 1$. The reader should notice that, if $k$
is a scalar matrix of order $n$, then $|k| = k^n$.

If $|A| = 0$, $A$ is said to be singular; if $|A| \neq 0$, $A$ is regular or non-singular.
When $A$ is regular, $A^{-1}$ is the only solution of $AX = 1$ or of $XA = 1$. For,
if $AX = 1$, then

$$A^{-1} = A^{-1} \cdot 1 = A^{-1}AX = X.$$

If $AX = 0$, then either $X = 0$ or $A$ is singular; for, if $A^{-1}$ exists,

$$0 = A^{-1}AX = X.$$


If $A^2 = A \neq 0$, then $A$ is said to be idempotent; for example $e_{11}$ is
idempotent. A matrix a power of which is 0 is called nilpotent. If the
lowest power of $A$ which is 0 is $A^r$, $r$ is called the index of $A$; for example, if
$A = e_{12} + e_{23} + e_{34}$, then

$$A^2 = e_{13} + e_{24}, \qquad A^3 = e_{14}, \qquad A^4 = 0,$$

so that the index of $A$ in this case is 4.
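The index of the example just given can be confirmed by forming successive powers. This is a minimal Python sketch, not part of the original text; `nilpotency_index` is a hypothetical helper, and it relies on the standard fact that a nilpotent matrix of order $n$ satisfies $A^n = 0$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotency_index(a):
    """Smallest r with a**r == 0, or None if no power up to the n-th vanishes."""
    n = len(a)
    zero = [[0] * n for _ in range(n)]
    power = a
    for r in range(1, n + 1):
        if power == zero:
            return r
        power = matmul(power, a)
    return None

# A = e_12 + e_23 + e_34 in the fundamental basis of order 4.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
print(nilpotency_index(A))
```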
1.06 The transverse of a matrix. If $A = \|a_{ij}\|$, the matrix $\|a'_{ij}\|$ in
which $a'_{ij} = a_{ji}$ is called the transverse$^3$ of $A$ and is denoted by $A'$. For instance
the transverse of

$$\begin{Vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{Vmatrix} \quad \text{is} \quad \begin{Vmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{Vmatrix}.$$


The transverse, then, is obtained by the interchange of corresponding rows and 
columns. It must be carefully noted that this definition is relative to a par- 
ticular set of fundamental units and, if these are altered, the transverse must 
also be changed. 

Theorem 2. The transverse of a sum is the sum of the transverses of the sepa- 
rate terms , and the transverse of a product is the product of the transverses of the 
separate factors in the reverse order. 

The proof of the first part of the theorem is immediate and is left to the
reader. To prove the second it is sufficient to consider two factors. Let
$A = \|a_{ij}\|$, $B = \|b_{ij}\|$, $C = AB = \|c_{ij}\|$ and, as above, set $a'_{ij} = a_{ji}$,
$b'_{ij} = b_{ji}$, $c'_{ij} = c_{ji}$; then

$$c'_{ij} = c_{ji} = \sum_{p} a_{jp}b_{pi} = \sum_{p} b'_{ip}a'_{pj}$$

whence

$$(AB)' = C' = B'A'.$$

The proof for any number of factors follows by induction.

If $A = A'$, $A$ is said to be symmetric and, if $A = -A'$, it is called
skew-symmetric or skew. A scalar matrix $k$ is symmetric and the transverse of
$kA$ is $kA'$.

Theorem 3. Every matrix can be expressed uniquely as the sum of a sym- 
metric and a skew matrix. 

For if $A = B + C$, $B' = B$, $C' = -C$, then $A' = B' + C' = B - C$ and
therefore

$$B = (A + A')/2, \qquad C = (A - A')/2.$$

Conversely $2A = (A + A') + (A - A')$ and $A + A'$ is symmetric, $A - A'$
skew.
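The decomposition of Theorem 3 is easily carried out in exact arithmetic. The sketch below is illustrative and not part of the original text; the function names `transverse` and `symmetric_skew_parts` and the sample matrix are assumptions.

```python
from fractions import Fraction

def transverse(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]

def symmetric_skew_parts(a):
    """Return (B, C) with B = (A + A')/2 symmetric and C = (A - A')/2 skew."""
    t = transverse(a)
    n = len(a)
    b = [[Fraction(a[i][j] + t[i][j], 2) for j in range(n)] for i in range(n)]
    c = [[Fraction(a[i][j] - t[i][j], 2) for j in range(n)] for i in range(n)]
    return b, c

A = [[1, 2], [4, 3]]
B, C = symmetric_skew_parts(A)
print(B, C)
```

Here $B$ has rows $(1, 3)$, $(3, 3)$ and $C$ has rows $(0, -1)$, $(1, 0)$, and their sum recovers $A$.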

3 It is also called the transposed or conjugate of A. It is sometimes written A . 
1.07 Bilinear forms. A scalar bilinear form in two variable vectors,
$x = \sum \xi_i e_i$, $y = \sum \eta_i e_i$, is a function of the form

$$A(x, y) = \sum_{i,j=1}^{n} a_{ij}\xi_i\eta_j. \tag{17}$$

There is therefore a one-to-one correspondence between such forms and
matrices, $A = \|a_{ij}\|$ corresponding to $A(x, y)$. The special form for which
$A = \|\delta_{ij}\| = 1$ is of very frequent occurrence and we shall denote it by $S$; it is
convenient to omit the brackets and write simply

$$Sxy = \xi_1\eta_1 + \xi_2\eta_2 + \cdots + \xi_n\eta_n \tag{18}$$

and, because of the manner in which it appears in vector analysis, we shall
call it the scalar of $xy$. Since $S$ is symmetric, $Sxy = Syx$.

The function (17) can be conveniently expressed in terms of $A$ and $S$; for
we may write $A(x, y)$ in the form

$$A(x, y) = \sum_{i=1}^{n} \xi_i \Big( \sum_{j=1}^{n} a_{ij}\eta_j \Big) = SxAy.$$

It may also be written

$$A(x, y) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} a_{ij}\xi_i \Big) \eta_j = SA'xy = SyA'x;$$

hence

$$SxAy = SyA'x, \tag{19}$$

so that the form (17) is unaltered when $x$ and $y$ are interchanged if at the same
time $A$ is changed into $A'$. This gives another proof of Theorem 2. For

$$Sx(AB)'y = SyABx = SBxA'y = SxB'A'y,$$

which gives $(AB)' = B'A'$ since $x$ and $y$ are independent variables.
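Identity (19) can be illustrated with concrete numbers. This Python sketch is not part of the original text; the helper names `S`, `apply`, `transverse` and the sample data are illustrative assumptions.

```python
def S(x, y):
    """The scalar of xy: the sum of products of corresponding coordinates."""
    return sum(p * q for p, q in zip(x, y))

def apply(a, x):
    """The vector Ax: coordinate i is the scalar of row i of A with x."""
    return [S(row, x) for row in a]

def transverse(a):
    return [list(col) for col in zip(*a)]

A = [[1, 2], [3, 4]]
x = [5, 6]
y = [7, 8]

left = S(x, apply(A, y))               # SxAy
right = S(y, apply(transverse(A), x))  # SyA'x
print(left, right)
```

Both sides evaluate to 433 for this data.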

1.08 Change of basis. We shall now investigate more closely the effect of
a change in the fundamental basis on the coordinates of a vector or matrix.
If $f_1, f_2, \cdots, f_n$ is a basis of our $n$-space, we have seen (§1.02) that the $f$'s are
linearly independent. Let

$$f_i = \sum_{j=1}^{n} p_{ji}e_j = Pe_i \qquad (i = 1, 2, \cdots, n), \tag{20}$$

$$P = \|p_{ij}\|.$$

Since the $f$'s form a basis, the $e$'s are linearly expressible in terms of them, say

$$e_i = \sum_{j=1}^{n} q_{ji}f_j \qquad (i = 1, 2, \cdots, n) \tag{21}$$

and, if $Q = \|q_{ij}\|$, this may be written

$$e_i = \sum_{j} q_{ji} \sum_{k} p_{kj}e_k = PQe_i \qquad (i = 1, 2, \cdots, n). \tag{22}$$

Hence $PQ = 1$, which is only possible if $|P| \neq 0$, $Q = P^{-1}$.

Conversely, if $|P| \neq 0$, $Q = P^{-1}$, and $f_i = Pe_i$ as in (20), then (22) holds
and therefore also (21), that is, the $e$'s, and therefore also any vector $x$, are
linearly expressible in terms of the $f$'s. We have therefore the following
theorem.

Theorem 4. If $f_i = Pe_i$ $(i = 1, 2, \cdots, n)$, the vectors $f_i$ form a basis if, and
only if, $|P| \neq 0$.

If we have fewer than $n$ vectors, say $f_1, f_2, \cdots, f_r$, we have seen in §1.02
that we can choose $f_{r+1}, \cdots, f_n$ so that $f_1, f_2, \cdots, f_n$ form a basis. Hence

Theorem 5. If $f_1, f_2, \cdots, f_r$ are linearly independent, there exists at least
one non-singular matrix $P$ such that $Pe_i = f_i$ $(i = 1, 2, \cdots, r)$.

We shall now determine how the form $Sxy$, which was defined relatively to
the fundamental basis, is altered by a change of basis. As above let

$$f_i = Pe_i, \qquad e_i = P^{-1}f_i = Qf_i, \qquad |P| \neq 0 \qquad (i = 1, 2, \cdots, n) \tag{23}$$

be a basis and

$$x = \sum \xi_i e_i = \sum \xi'_i f_i, \qquad y = \sum \eta_i e_i = \sum \eta'_i f_i$$

variable vectors; then from (23)

$$x = \sum \xi'_i f_i = P \sum \xi'_i e_i, \qquad y = \sum \eta'_i f_i = P \sum \eta'_i e_i$$

and

$$\sum \xi'_i e_i = P^{-1}x = Qx, \qquad \sum \eta'_i e_i = Qy.$$

Let us set temporarily $S_e xy$ for $Sxy$ and also put $S_f xy = \sum \xi'_i\eta'_i$, the
corresponding form with reference to the new basis; then

$$S_f xy = S_e QxQy = S_e xQ'Qy, \qquad S_e xy = S_f PxPy. \tag{24}$$

Consider now a matrix $A = \|a_{ij}\|$ defined relatively to the fundamental
basis and let $A_1$ be the matrix which has the same coordinates when expressed
in terms of the new basis as $A$ has in the old. From the definition of $A$ and
from $\xi_j = S_e e_j x$ we have

$$Ax = \sum_{i,j} a_{ij}\xi_j e_i = \sum_{i,j} a_{ij}e_i S_e e_j x$$

and hence

$$A_1 x = \sum a_{ij}f_i S_f f_j x = \sum a_{ij}Q^{-1}e_i S_e Qf_j Qx = Q^{-1} \sum a_{ij}e_i S_e e_j Qx = Q^{-1}AQx. \tag{25}$$

We have therefore, remembering that $Q = P^{-1}$,

Theorem 6. If $f_i = Pe_i$ $(i = 1, 2, \cdots, n)$ is a basis and $A$ any matrix, the
matrix $PAP^{-1}$ has the same coordinates when expressed in terms of this basis as $A$
has in terms of the fundamental basis.

The matrix $Q^{-1}AQ$ is said to be similar to $A$ and to be the transform of $A$
by $Q$. Obviously the transform of a product (sum) is the product (sum) of
the transforms of the individual factors (terms) with the order unaltered. For
instance $Q^{-1}ABQ = Q^{-1}AQ \cdot Q^{-1}BQ$.

Theorem 6 gives the transformation of the matric units $e_{ij}$ defined in §1.03
which corresponds to the vector transformation (23); the result is that, if $f_{ij}$
is the unit in the new system corresponding to $e_{ij}$, then

$$f_{ij} = Pe_{ij}P^{-1}$$

which is readily verified by setting

$$A = e_{ij} = e_i S_e e_j(\ ), \qquad A_1 = f_{ij} = f_i S_f f_j(\ )$$

in (25). The effect of the change of basis on the form of the transverse is
found as follows. Let $A^*$ be defined by

$$S_f xAy = S_f yA^*x;$$

then

$$S_f yA^*x = S_f xAy = S_e QxQAy = S_e xQ'QAy = S_e yA'Q'Qx = S_f y(Q'Q)^{-1}A'Q'Qx.$$

Hence

$$A^* = (Q'Q)^{-1}A'Q'Q. \tag{26}$$

1.09 Reciprocal and orthogonal bases. With the same notation as in the
previous section we have $S_f f_i f_j = 0$ $(i \neq j)$, $S_f f_i f_i = 1$. Hence

$$\delta_{ij} = S_f f_i f_j = S_e Qf_i Qf_j = S_e f_i Q'Qf_j.$$

If, therefore, we set

$$f'_i = Q'Qf_i \qquad (i = 1, 2, \cdots, n), \tag{27}$$

we have, on omitting the subscript $e$ in $S_e$,

$$Sf_i f'_j = \delta_{ij} \qquad (i, j = 1, 2, \cdots, n). \tag{28}$$

Since $|Q'Q| \neq 0$, the vectors $f'_1, f'_2, \cdots, f'_n$ form a basis which we say is
reciprocal to $f_1, f_2, \cdots, f_n$. This definition is of course relative to the fundamental
basis since it depends on the function $S$ but, apart from this, the basis $(f'_i)$ is
uniquely defined when the basis $(f_i)$ is given since the vectors $f_i$ determine $P$
and $Q = P^{-1}$.
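A small numerical instance may make (27) and (28) concrete. The following Python sketch is an illustration only; the particular $P$ and its inverse $Q$ are sample values chosen so that all entries stay integral, and the helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

def transverse(a):
    return [list(col) for col in zip(*a)]

def S(x, y):
    return sum(p * q for p, q in zip(x, y))

def apply(a, x):
    return [S(row, x) for row in a]

identity = [[1, 0], [0, 1]]
P = [[1, 1], [0, 1]]
Q = [[1, -1], [0, 1]]            # Q = P^{-1}, so that PQ = 1

f = [apply(P, e) for e in identity]          # f_i = P e_i (the columns of P)
QtQ = matmul(transverse(Q), Q)               # Q'Q
f_rec = [apply(QtQ, v) for v in f]           # f'_i = Q'Q f_i

# Table of scalars S f_i f'_j, which should be the Kronecker delta.
table = [[S(fi, fj) for fj in f_rec] for fi in f]
print(f, f_rec, table)
```

For this choice the reciprocal basis is $f'_1 = (1, -1)$, $f'_2 = (0, 1)$.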

The relation between $(f'_i)$ and $(f_i)$ is a reciprocal one; for

$$f'_i = Q'Qf_i = Q'e_i$$

and, if $R = (Q')^{-1}$, we have $f_i = R'Rf'_i$.

If only the set $(f_1, f_2, \cdots, f_r)$ is supposed given originally, and this set of
linearly independent vectors is extended by $f_{r+1}, \cdots, f_n$ to form a basis of
the $n$-space, then $f'_{r+1}, \cdots, f'_n$ individually depend on the choice of
$f_{r+1}, \cdots, f_n$. But (28) shows that, if $Sf_i x = 0$ $(i = 1, 2, \cdots, r)$, then $x$ belongs to
the linear set $(f'_{r+1}, \cdots, f'_n)$; hence this linear set is uniquely determined
although the individual members of its basis are not. We may therefore
without ambiguity call $\mathfrak{F}' = (f'_{r+1}, \cdots, f'_n)$ reciprocal to $\mathfrak{F} = (f_1, f_2, \cdots, f_r)$;
$\mathfrak{F}'$ is then the set of all vectors $x$ for which $Sxy = 0$ whenever $y$ belongs to $\mathfrak{F}$.

In a later chapter we shall require the following lemma. 

Lemma 1. If $(f_1, f_2, \cdots, f_r)$ and $(f'_{r+1}, \cdots, f'_n)$ are reciprocal, so also are
$(B^{-1}f_1, B^{-1}f_2, \cdots, B^{-1}f_r)$ and $(B'f'_{r+1}, B'f'_{r+2}, \cdots, B'f'_n)$ where $B$ is any
non-singular matrix.

For $SB'f'_i B^{-1}f_j = Sf'_i BB^{-1}f_j = Sf'_i f_j = \delta_{ij}$.

Reciprocal bases have a close connection with reciprocal or inverse matrices
in terms of which they might have been defined. If $P$ is non-singular and
$Pe_i = f_i$ as above, then $P = \sum f_i Se_i(\ )$ and, if $Q = \sum e_i Sf'_i(\ )$, then

$$PQ = \sum_{i,j} f_i Se_i e_j Sf'_j(\ ) = \sum_{i} f_i Sf'_i(\ ) = 1$$

so that $Q = P^{-1}$.

If $QQ' = 1$, the bases $(f_i)$ and $(f'_i)$ are identical and $Sf_i f_j = \delta_{ij}$ for all $i$ and $j$;
the basis is then said to be orthogonal as is also the matrix $Q$. The inverse of
an orthogonal matrix and the product of two or more orthogonal matrices are
orthogonal; for, if $RR' = 1$,

$$(RQ)(RQ)' = RQQ'R' = RR' = 1.$$

Suppose that $h_1, h_2, \cdots, h_r$ are real vectors which are linearly independent,
and for which $Sh_i h_j = \delta_{ij}$ $(i \neq j)$; since $h_i$ is real, we have $Sh_i h_i \neq 0$. If $r < n$,
we can always find a real vector $x$ which is not in the linear set $(h_1, \cdots, h_r)$
and, if we put

$$h_{r+1} = x - \sum_{i=1}^{r} h_i\, Sh_i x / Sh_i h_i,$$

then $h_{r+1} \neq 0$ and $Sh_i h_{r+1} = 0$ $(i = 1, 2, \cdots, r)$. Hence we can extend the
original set to form a basis of the fundamental $n$-space. If we set
$f_i = h_i/(Sh_i h_i)^{1/2}$, then $Sf_i f_j = \delta_{ij}$ even when $i = j$; this modified basis is called an
orthogonal basis of the set.
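The extension step for real vectors can be sketched in a few lines of Python. This is an illustration and not part of the original text; `extend_orthogonal` is a hypothetical name, the sample vectors are arbitrary, and the final normalization by $(Sh_ih_i)^{1/2}$ is omitted so that the arithmetic stays exact.

```python
from fractions import Fraction

def S(x, y):
    return sum(p * q for p, q in zip(x, y))

def extend_orthogonal(hs, x):
    """Return h_{r+1} = x - sum_i h_i * (S h_i x) / (S h_i h_i),
    where hs is a list of mutually orthogonal real vectors."""
    new = [Fraction(c) for c in x]
    for h in hs:
        coeff = Fraction(S(h, x), S(h, h))
        new = [n - coeff * c for n, c in zip(new, h)]
    return new

hs = [[1, 1, 0], [1, -1, 0]]     # mutually orthogonal: S h_1 h_2 = 0
x = [1, 2, 3]                    # not in the linear set (h_1, h_2)

h3 = extend_orthogonal(hs, x)
print(h3, [S(h, h3) for h in hs])
```

The new vector comes out as $(0, 0, 3)$, which is orthogonal to both given vectors.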
If the vectors $h_i$ are not necessarily real, it is not evident that $x$ can be chosen
so that $Sh_{r+1}h_{r+1} \neq 0$ when $Sh_i h_i \neq 0$ $(i = 1, 2, \cdots, r)$. This may be
shown as follows. In the first place we cannot have $Syh_{r+1} = 0$ for every $y$,
and hence $Sh_{r+1}h_{r+1} \neq 0$ when $r = n - 1$. Suppose now that for every
choice of $x$ we have $Sh_{r+1}h_{r+1} = 0$; we can then choose a basis $h_{r+1}, \cdots, h_n$
supplementary to $h_1, \cdots, h_r$ such that $Sh_i h_i = 0$ $(i = r + 1, \cdots, n)$ and
$Sh_i h_j = 0$ $(i = r + 1, \cdots, n;\ j = 1, 2, \cdots, r)$. Since we cannot have
$Sh_{r+1}h_i = 0$ for every $h_i$ of the basis of the $n$-space, this scalar must be
different from 0 for some value of $i > r$, say $r + k$. If we then put
$h'_{r+1} = h_{r+1} + h_{r+k}$ in place of $h_{r+1}$, we have $Sh_i h'_{r+1} = 0$ $(i = 1, 2, \cdots, r)$ as before
and also

$$Sh'_{r+1}h'_{r+1} = Sh_{r+1}h_{r+1} + Sh_{r+k}h_{r+k} + 2Sh_{r+1}h_{r+k} = 2Sh_{r+1}h_{r+k} \neq 0.$$

We can therefore extend the basis in the manner indicated for real vectors
even when the vectors are complex.

When complex coordinates are in question the following lemma is useful; 
it contains the case discussed above when the vectors used are real. 

Lemma 2. When a linear set of order $r$ is given, it is always possible to choose
a basis $g_1, g_2, \cdots, g_n$ of the fundamental space such that $g_1, \cdots, g_r$ is a basis
of the given set and such that $Sg_i\bar{g}_j = \delta_{ij}$ where $\bar{g}_j$ is the vector whose coordinates
are the conjugates of the coordinates of $g_j$ when expressed in terms of the
fundamental basis.

The proof is a slight modification of the one already given for the real case.
Suppose that $g_1, \cdots, g_s$ are chosen so that $Sg_i\bar{g}_j = \delta_{ij}$ $(i, j = 1, 2, \cdots, s)$
and such that $(g_1, \cdots, g_s)$ lies in the given set when $s \leq r$ and, when $s > r$,
$(g_1, \cdots, g_r)$ is a basis of this set. We now put

$$g'_{s+1} = x - \sum_{i=1}^{s} g_i\, S\bar{g}_i x / Sg_i\bar{g}_i$$

which is not 0 provided $x$ is not in $(g_1, \cdots, g_s)$ and, if $s < r$, will lie in the
given set provided $x$ does. We may then put

$$g_{s+1} = g'_{s+1}/(Sg'_{s+1}\bar{g}'_{s+1})^{1/2}$$

and the lemma follows readily by induction.

If $U$ is the matrix $\sum e_i Sg_i$, then $\bar{U} = \sum e_i S\bar{g}_i$ and

$$U\bar{U}' = 1. \tag{29}$$

Such a matrix is called a unitary matrix and the basis $g_1, g_2, \cdots, g_n$ is called a
unitary basis. A real unitary matrix is of course orthogonal.
1.10 The rank of a matrix. Let $A = \|a_{ij}\|$ be a matrix and set (cf. (12)
§1.03)

$$h_i = Ae_i = \sum_{j} a_{ji}e_j,$$

then, if

$$x = \sum \xi_i e_i = \sum e_i Se_i x$$

is any vector, we have

$$Ax = A \sum e_i Se_i x = \sum Ae_i Se_i x$$

or

$$Ax = \sum_{i=1}^{n} h_i Se_i x. \tag{30}$$

Any expression of the form $Ax = \sum_{i=1}^{m} a_i Sb_i x$, where $a_i$, $b_i$ are constant
vectors, is a linear homogeneous vector function of $x$. Here (30) shows that it
is never necessary to take $m > n$, but it is sometimes convenient to do so.
When we are interested mainly in the matrix and not in $x$, we may write
$A = \sum a_i Sb_i(\ )$ or, omitting the brackets, merely

$$A = \sum a_i Sb_i. \tag{31}$$

It follows readily from the definition of the transverse that

$$A' = \sum b_i Sa_i. \tag{32}$$

No matter what vector $x$ is, $Ax$, being equal to $\sum a_i Sb_i x$, is linearly dependent
on $a_1, a_2, \cdots, a_m$ or, if the form (30) is used, on $h_1, h_2, \cdots, h_n$. When $|A| \neq 0$,
we have seen in Theorem 4 that the $h$'s are linearly independent but, if $A$
is singular, there are linear relations connecting them, and the order of the
linear set $(a_1, a_2, \cdots, a_m)$ is less than $n$.

Suppose in (31) that the $a$'s are not linearly independent, say

$$a_s = \alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_{s-1}a_{s-1},$$

then on substituting this value of $a_s$ in (31) we have

$$A = a_1 S(b_1 + \alpha_1 b_s) + \cdots + a_{s-1}S(b_{s-1} + \alpha_{s-1}b_s) + \sum_{i=s+1}^{m} a_i Sb_i,$$

an expression similar to (31) but having at least one term less. A similar
reduction can be carried out if the $b$'s are not linearly independent. After a
finite number of repetitions of this process we shall finally reach a form

$$A = \sum_{i=1}^{r} c_i Sd_i \tag{33}$$

in which $c_1, c_2, \cdots, c_r$ are linearly independent and also $d_1, d_2, \cdots, d_r$. The
integer $r$ is called the rank of $A$.
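One step of this reduction can be followed on a small example. The Python sketch below is illustrative and not part of the original text: `dyad_sum` and `rank` are hypothetical helpers, the vectors are sample data with $a_3 = a_1 + a_2$, and the rank is computed by ordinary elimination on the rows, which is a swapped-in technique rather than the reduction described above.

```python
from fractions import Fraction

def dyad_sum(a_vecs, b_vecs):
    """Matrix of x -> sum_i a_i * S(b_i, x); its (p, q) entry is sum_i a_i[p] * b_i[q]."""
    n = len(a_vecs[0])
    return [[sum(a[p] * b[q] for a, b in zip(a_vecs, b_vecs)) for q in range(n)]
            for p in range(n)]

def rank(a):
    """Rank by elimination on the rows, in exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in a]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Three terms, with a_3 = a_1 + a_2 (alpha_1 = alpha_2 = 1).
a = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
b = [[1, 1, 0], [0, 1, 1], [1, 0, 0]]
A = dyad_sum(a, b)

# After the substitution: a_1 S(b_1 + b_3) + a_2 S(b_2 + b_3).
reduced = dyad_sum(a[:2], [[2, 1, 0], [1, 1, 1]])
print(A, reduced, rank(A))
```

Both expressions give the same matrix, and its rank is 2, the number of terms remaining after the reduction.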

It is clear that the value of $r$ is independent of the manner in which the
reduction to the form (33) is carried out since it is the order of the linear set
$(Ae_1, Ae_2, \cdots, Ae_n)$. We shall, however, give a proof of this which
incidentally yields some important information regarding the nature of $A$.

Suppose that by any method we have arrived at two forms of $A$

$$A = \sum_{i=1}^{r} c_i Sd_i = \sum_{i=1}^{s} p_i Sq_i,$$

where $(c_1, c_2, \cdots, c_r)$ and $(d_1, d_2, \cdots, d_r)$ are spaces of order $r$ and
$(p_1, p_2, \cdots, p_s)$, $(q_1, q_2, \cdots, q_s)$ spaces of order $s$, and let
$(c'_{r+1}, c'_{r+2}, \cdots, c'_n), \cdots, (q'_{s+1}, q'_{s+2}, \cdots, q'_n)$ be the corresponding
reciprocal spaces. Then

$$Aq'_j = \sum_{i=1}^{s} p_i Sq_i q'_j = p_j \qquad (j = 1, 2, \cdots, s)$$

and also $Aq'_j = \sum c_i Sd_i q'_j$. Hence each $p_j$ lies in $(c_1, c_2, \cdots, c_r)$. Similarly
each $c_i$ lies in $(p_1, p_2, \cdots, p_s)$ so that these two subspaces are the same and,
in particular, their orders are equal, that is, $r = s$. The same discussion with
$A'$ in place of $A$ shows that $(d_1, d_2, \cdots, d_r)$ and $(q_1, q_2, \cdots, q_s)$ are the same.
We shall call the spaces $\mathfrak{G}_l = (c_1, c_2, \cdots, c_r)$, $\mathfrak{G}_r = (d_1, d_2, \cdots, d_r)$ the left
and right grounds of $A$, and the total space $\mathfrak{G} = (c_1, \cdots, c_r, d_1, \cdots, d_r)$ will
be called the (total) ground of $A$.

If $x$ is any vector in the subspace $\mathfrak{N}_r = (d'_{r+1}, d'_{r+2}, \cdots, d'_n)$ reciprocal
to $\mathfrak{G}_r$, then $Ax = 0$ since $Sd_i d'_j = 0$ $(i \neq j)$. Conversely, if

$$0 = Ax = \sum c_i Sd_i x,$$

each multiplier $Sd_i x$ must be 0 since the $c$'s are linearly independent; hence every
solution of $Ax = 0$ lies in $\mathfrak{N}_r$. Similarly every solution of $A'x = 0$ lies in
$\mathfrak{N}_l = (c'_{r+1}, c'_{r+2}, \cdots, c'_n)$. We call $\mathfrak{N}_r$ and $\mathfrak{N}_l$ the right and left null spaces
of $A$; their order, $n - r$, is called the nullity of $A$.

We may summarize these results as follows. 

Theorem 7. If a matrix $A$ is expressed in the form $\sum_{i=1}^{r} a_i Sb_i$, where
$\mathfrak{G}_l = (a_1, a_2, \cdots, a_r)$ and $\mathfrak{G}_r = (b_1, b_2, \cdots, b_r)$ define spaces of order $r$, then, no
matter how the reduction to this form is carried out, the spaces $\mathfrak{G}_l$ and $\mathfrak{G}_r$ are always
the same. Further, if $\mathfrak{N}_l$ and $\mathfrak{N}_r$ are the spaces of order $n - r$ reciprocal to $\mathfrak{G}_l$
and $\mathfrak{G}_r$, respectively, every solution of $Ax = 0$ lies in $\mathfrak{N}_r$ and every solution of
$A'x = 0$ in $\mathfrak{N}_l$.

The following theorem is readily deduced from Theorem 7 and its proof is 
left to the reader. 
Theorem 8. If $A$, $B$ are matrices of rank $r$, $s$, the rank of $A + B$ is not
greater than $r + s$ and the rank of $AB$ is not greater than the smaller of $r$ and $s$.

1.11 Linear dependence. The definition of the rank of a matrix in the 
preceding section was made in terms of the linear dependence of vectors associ- 
ated with the matrix. In this section we consider briefly the theory of linear 
dependence introducing incidentally a notation which we shall require later. 

n 

Let Xi = e 3 - (i = 1, 2, • • r; r < n) be a set of r vectors. From the 

j-i 

rectangular array of their coordinates 

£n £12 * • • fin 

&l f22 * * * £2 n 

(34) 


frl £r2 * * * £ rn 

there can be formed $n!/r!(n - r)!$ different determinants of order $r$ by choosing
$r$ columns out of (34), these columns being taken in their natural order. If
these determinants are arranged in some definite order, we may regard them
as the coordinates of a vector in space of order $n!/r!(n - r)!$ and, when this
is done, we shall denote this vector by$^4$

$$|x_1 x_2 \cdots x_r| \tag{35}$$

and call it a pure vector of grade $r$. It follows from this definition that $|x_1 x_2 \cdots x_r|$
has many of the properties of a determinant; its sign is changed if two
$x$'s are interchanged, it vanishes when two $x$'s are equal and, if $\lambda$ and $\mu$ are
scalars,

$$|(\lambda x_1 + \mu x_1')\, x_2 \cdots x_r| = \lambda\, |x_1 x_2 \cdots x_r| + \mu\, |x_1' x_2 \cdots x_r|. \tag{36}$$

If we replace the $x$'s in (35) by $r$ different units $e_{i_1}, \cdots, e_{i_r}$, the result is
clearly not 0: we thus obtain $\binom{n}{r}$ vectors which we shall call the fundamental
unit vectors of grade $r$; and any linear combination of these units, say

$$\sum \xi_{i_1 i_2 \cdots i_r}\, |e_{i_1} e_{i_2} \cdots e_{i_r}|,$$

is called a vector of grade $r$. It should be noticed that not every vector is a
pure vector except when $r$ equals 1 or $n$.

If we replace $x_i$ by $\sum_j \xi_{ij} e_j$ in (35), we get

$$|x_1 x_2 \cdots x_r| = \sum \xi_{1j_1} \xi_{2j_2} \cdots \xi_{rj_r}\, |e_{j_1} e_{j_2} \cdots e_{j_r}|,$$

where the summation extends over all permutations $j_1, j_2, \cdots, j_r$ of $1, 2, \cdots, n$
taken $r$ at a time. This summation may be effected by grouping together the
sets $j_1, j_2, \cdots, j_r$ which are permutations of the same combination $i_1, i_2, \cdots, i_r$,
whose members may be taken to be arranged in natural order, and then summing
these partial sums over all possible combinations $i_1, i_2, \cdots, i_r$. Taking
the first step only we have

$$\sum \xi_{1j_1} \cdots \xi_{rj_r}\, |e_{j_1} e_{j_2} \cdots e_{j_r}| = \sum \delta^{\,i_1 \cdots i_r}_{\,j_1 \cdots j_r}\, \xi_{1j_1} \cdots \xi_{rj_r}\, |e_{i_1} e_{i_2} \cdots e_{i_r}|$$

where $\delta^{\,i_1 \cdots i_r}_{\,j_1 \cdots j_r}$ is the sign corresponding to the permutation $\begin{pmatrix} i_1 & \cdots & i_r \\ j_1 & \cdots & j_r \end{pmatrix}$ and this
equals $|\xi_{1i_1} \cdots \xi_{ri_r}|\, |e_{i_1} \cdots e_{i_r}|$. We have therefore

$$|x_1 x_2 \cdots x_r| = \sum_{(i)}^{*} |\xi_{1i_1} \xi_{2i_2} \cdots \xi_{ri_r}|\, |e_{i_1} e_{i_2} \cdots e_{i_r}|, \tag{37}$$

where the asterisk on $\sum$ indicates that the sum is taken over all $r$-combinations
of $1, 2, \cdots, n$, each combination being arranged in natural order.

$^4$ If it had been advisable to use here the indeterminate product of Grassmann, (35)
would appear as a determinant in much the ordinary sense (cf. §5.00).


Theorem 9. $|x_1 x_2 \cdots x_r| = 0$ if, and only if, $x_1, x_2, \cdots, x_r$ are linearly
dependent.

The first part of this theorem is an immediate consequence of (36). To prove
the converse it is sufficient to show that, if $|x_1 x_2 \cdots x_{r-1}| \neq 0$, then there
exist scalars $\alpha_1, \alpha_2, \cdots, \alpha_{r-1}$ such that

$$x_r = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_{r-1} x_{r-1}.$$

Let $x_i = \sum_j \xi_{ij} e_j$. Since $|x_1 x_2 \cdots x_{r-1}| \neq 0$, at least one of its coordinates
is not 0, and for convenience we may suppose without loss of generality that

$$|\xi_{11} \xi_{22} \cdots \xi_{r-1,\,r-1}| \neq 0. \tag{38}$$

Since $|x_1 x_2 \cdots x_r| = 0$, all its coordinates equal 0 and in particular

$$|\xi_{11} \xi_{22} \cdots \xi_{r-1,\,r-1}\, \xi_{ri}| = 0 \qquad (i = 1, 2, \cdots, n).$$

If we expand this determinant according to the elements of its last column,
we get a relation of the form

$$\beta_1 \xi_{1i} + \beta_2 \xi_{2i} + \cdots + \beta_r \xi_{ri} = 0$$

where the $\beta$'s are independent of $i$ and $\beta_r \neq 0$ by (38). Hence we may write

$$\xi_{ri} = \alpha_1 \xi_{1i} + \cdots + \alpha_{r-1} \xi_{r-1,\,i} \qquad (i = 1, 2, \cdots, n) \tag{39}$$

the $\alpha$'s being independent of $i$. Multiplying (39) by $e_i$ and summing with
regard to $i$, we have

$$x_r = \alpha_1 x_1 + \cdots + \alpha_{r-1} x_{r-1},$$

which proves the theorem.
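
The criterion of Theorem 9 is easy to try out numerically. The following sketch is an illustration and not part of the original text; the function names `det` and `pure_vector` and the sample vectors are assumptions. It lists the coordinates of $|x_1 x_2 \cdots x_r|$, that is, the $r$-rowed determinants formed from the array of coordinates with the columns in natural order, and checks that they all vanish exactly when the vectors are linearly dependent.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (adequate for small r)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def pure_vector(vectors):
    """Coordinates of |x_1 ... x_r|: all r-rowed minors, columns in natural order."""
    r, n = len(vectors), len(vectors[0])
    return [det([[row[c] for c in cols] for row in vectors])
            for cols in combinations(range(n), r)]

# Hypothetical data: three vectors in space of order 4.
independent = [[1, 0, 2, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
dependent = [[1, 0, 2, 0], [0, 1, 1, 0], [2, -3, 1, 0]]   # third = 2*first - 3*second

coords_independent = pure_vector(independent)
coords_dependent = pure_vector(dependent)
```

With $n = 4$ and $r = 3$ there are $4!/3!\,1! = 4$ coordinates; for the dependent set every one of them is zero.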

If $(a_1, a_2, \cdots, a_m)$ is a linear set of order $r$, then some set of $r$ $a$'s form a
basis, that is, are linearly independent while each of the other $a$'s is linearly
dependent on them. By a change of notation, if necessary, we may take
$a_1, a_2, \cdots, a_r$ as this basis and write

$$a_{r+i} = \sum_{j=1}^{r} \beta_{ij} a_j \qquad (i = 1, 2, \cdots, m - r). \tag{40}$$

We shall now discuss the general form of all linear relations among the $a$'s in
terms of the special relations (40); and in doing so we may assume the order
of the space to be equal to or greater than $m$ since we may consider any given
space as a subspace of one of arbitrarily higher dimensionality.

Let

$$\sum_{j=1}^{m} \gamma_j a_j = 0 \tag{41}$$

be a relation connecting the $a$'s and set

$$c = \sum_{j=1}^{m} \gamma_j e_j.$$

Then (40), considered as a special case of (41), corresponds to setting for $c$

$$c_i = -\sum_{j=1}^{r} \beta_{ij} e_j + e_{r+i} \qquad (i = 1, 2, \cdots, m - r), \tag{42}$$

and there is clearly no linear relation connecting these vectors so that they
define a linear set of order $m - r$. Using (40) in (41) we have

$$\sum_{j=1}^{r} \Bigl( \gamma_j + \sum_{i=1}^{m-r} \gamma_{r+i} \beta_{ij} \Bigr) a_j = 0$$

and, since $a_1, a_2, \cdots, a_r$ are linearly independent, we have

$$\gamma_j = -\sum_{i=1}^{m-r} \beta_{ij} \gamma_{r+i} \qquad (j = 1, 2, \cdots, r)$$

whence

$$c = \sum_{j=1}^{m} \gamma_j e_j = -\sum_{i=1}^{m-r} \gamma_{r+i} \sum_{j=1}^{r} \beta_{ij} e_j + \sum_{i=1}^{m-r} \gamma_{r+i} e_{r+i} = \sum_{i=1}^{m-r} \gamma_{r+i} c_i \tag{43}$$

so that $c$ is linearly dependent on $c_1, c_2, \cdots, c_{m-r}$. Conversely, on retracing
these steps in the reverse order we see that, if $c$ is linearly dependent on these
vectors, so that $\gamma_{r+i}$ $(i = 1, 2, \cdots, m - r)$ are known, then from (43) the
$\gamma_j$ $(j = 1, 2, \cdots, r)$ are defined in such a way that $c = \sum_{1}^{m} \gamma_j e_j$ and $\sum_{1}^{m} \gamma_j a_j = 0$.
We have therefore the following theorem.

Theorem 10. If $a_1, a_2, \cdots, a_m$ is a linear set of order $r$, there exist $m - r$
linear relations $\sum_{j=1}^{m} \gamma_{ij} a_j = 0$ $(i = 1, 2, \cdots, m - r)$ such that (i) the vectors
$c_i = \sum_{j=1}^{m} \gamma_{ij} e_j$ are linearly independent and (ii) if $\sum \gamma_j a_j = 0$ is any linear relation
connecting the $a$'s, and if $c = \sum \gamma_j e_j$, then $c$ belongs to the linear set
$(c_1, c_2, \cdots, c_{m-r})$.

This result can be translated immediately in terms concerning the solution
of a system of ordinary linear equations or in terms of matrices. If $a_j = \sum_i a_{ji} e_i$,
then (41) may be written

$$\begin{aligned} a_{11}\gamma_1 + a_{21}\gamma_2 + \cdots + a_{m1}\gamma_m &= 0 \\ \cdots\cdots\cdots\cdots \qquad & \\ a_{1n}\gamma_1 + a_{2n}\gamma_2 + \cdots + a_{mn}\gamma_m &= 0 \end{aligned} \tag{44}$$

a system of linear homogeneous equations in the unknowns $\gamma_1, \gamma_2, \cdots, \gamma_m$.
Hence (44) has solutions for which some $\gamma_i \neq 0$ if, and only if, the rank $r$ of
the array

$$\begin{matrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{matrix} \tag{45}$$

is less than $m$ and, when this condition is satisfied, every solution is linearly
dependent on the set of $m - r$ solutions given by (42) which are found by
the method given in the discussion of Theorem 9.

Again, if we make (45) a square array by the introduction of columns or
rows of zeros and set $A = \| a_{ij} \|$, $c = \sum \gamma_i e_i$, then (41) becomes $A'c = 0$ and
Theorem 10 may therefore be interpreted as giving the properties of the null-space
of $A'$ which were derived in §1.10.



CHAPTER II 

ALGEBRAIC OPERATIONS WITH MATRICES. THE CHARACTERISTIC EQUATION 

2.01 Identities. The following elementary considerations enable us to carry 
over a number of results of ordinary scalar algebra into the algebra of matrices. 
Suppose $f(\lambda_1, \lambda_2, \cdots, \lambda_r)$, $g(\lambda_1, \lambda_2, \cdots, \lambda_r)$ are integral algebraic functions of
the scalar variables $\lambda_i$ with scalar coefficients, and suppose that

$$f(\lambda_1, \lambda_2, \cdots, \lambda_r) = g(\lambda_1, \lambda_2, \cdots, \lambda_r)$$

is an algebraic identity; then, when $f(\lambda_1, \cdots, \lambda_r) - g(\lambda_1, \cdots, \lambda_r)$ is reduced
to the standard form of a polynomial, the coefficients of the various powers of
the $\lambda$'s are zero. In carrying out this reduction no properties of the $\lambda$'s are
used other than those which state that they obey the laws of scalar multiplication
and addition: if then we replace $\lambda_1, \lambda_2, \cdots, \lambda_r$ by commutative matrices
$x_1, x_2, \cdots, x_r$, the reduction to the form 0 is still valid step by step and hence

$$f(x_1, x_2, \cdots, x_r) = g(x_1, x_2, \cdots, x_r).$$

An elementary example of this is

$$1 - x^2 = (1 - x)(1 + x)$$

or, when $xy = yx$,

$$x^2 - y^2 = (x - y)(x + y).$$

Here, if $xy \neq yx$, the reader should notice that the analogue of the algebraic
identity becomes

$$x^2 - y^2 = x(x + y) - (x + y)y,$$

which may also be written $x^2 - y^2 = (x - y)(x + y) + (yx - xy)$.
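
The role of commutativity in such identities can be seen on a pair of matrices that do not commute. The sketch below is only an illustration; the helpers `mul`, `add`, `sub` and the two sample matrices are assumptions and do not come from the text.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical non-commuting pair: xy != yx.
x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]

lhs = sub(mul(x, x), mul(y, y))                        # x^2 - y^2
naive = mul(sub(x, y), add(x, y))                      # (x - y)(x + y)
corrected = add(naive, sub(mul(y, x), mul(x, y)))      # (x - y)(x + y) + (yx - xy)
alternative = sub(mul(x, add(x, y)), mul(add(x, y), y))  # x(x + y) - (x + y)y
```

For this pair the scalar factorization fails, while both non-commutative forms reproduce $x^2 - y^2$.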

2.02 Matric polynomials in a scalar variable. By a matric polynomial in a
scalar variable $\lambda$ is meant a matrix that can be expressed in the form

$$P(\lambda) = p_0 \lambda^r + p_1 \lambda^{r-1} + \cdots + p_r \qquad (p_0 \neq 0), \tag{1}$$

where $p_0, p_1, \cdots, p_r$ are constant matrices. The coordinates of $P(\lambda)$ are scalar
polynomials in $\lambda$ and hence, if

$$Q(\lambda) = q_0 \lambda^s + q_1 \lambda^{s-1} + \cdots + q_s \qquad (q_0 \neq 0) \tag{2}$$

is also a matric polynomial, $P(\lambda) \equiv Q(\lambda)$ if, and only if, $r = s$ and the coefficients
of corresponding powers of $\lambda$ are equal, that is, $p_i = q_i$ $(i = 0, 1, \cdots, r)$. If
$|q_0| \neq 0$, the degree of the product $P(\lambda)Q(\lambda)$ (or $Q(\lambda)P(\lambda)$) is exactly $r + s$ since
the coefficient of the highest power $\lambda^{r+s}$ which occurs in the product is $p_0 q_0$
(or $q_0 p_0$) which cannot be 0 if $p_0 \neq 0$ and $|q_0| \neq 0$. If, however, both $|p_0|$ and
$|q_0|$ are 0, the degree of the product may well be less than $r + s$, as is seen from
the example

$$(e_{11}\lambda + 1)(e_{22}\lambda + 1) = e_{11}e_{22}\lambda^2 + (e_{11} + e_{22})\lambda + 1 = (e_{11} + e_{22})\lambda + 1.$$

Another noteworthy difference between matric and scalar polynomials is that,
when the determinant of a matric polynomial is a constant different from 0, its
inverse is also a matric polynomial: for instance

$$(e_{12}\lambda + 1)^{-1} = -e_{12}\lambda + 1,$$

$$[(e_{12} + e_{23})\lambda + 1]^{-1} = e_{13}\lambda^2 - (e_{12} + e_{23})\lambda + 1.$$

We shall call such polynomials elementary polynomials.

2.03 The division transformation. The greater part of the theory of the 
division transformation can be extended from ordinary algebra to the algebra 
of matrices; the main precaution that must be taken is that it must not be 
assumed that every element of the algebra has an inverse and that due allow- 
ance must be made for the peculiarities introduced by the lack of commuta- 
tivity in multiplication. 

Theorem 1. If $P(\lambda)$ and $Q(\lambda)$ are the polynomials defined by (1) and (2),
and if $|q_0| \neq 0$, there exist unique polynomials $S(\lambda)$, $R(\lambda)$, $S_1(\lambda)$, $R_1(\lambda)$, of
which $S$ and $S_1$, if not zero, are of degree $r - s$ and the degrees of $R$ and $R_1$ are
$s - 1$ at most, such that

$$P(\lambda) \equiv S(\lambda)Q(\lambda) + R(\lambda) \equiv Q(\lambda)S_1(\lambda) + R_1(\lambda).$$

If $r < s$, we may take $S_1 = S = 0$ and $R_1 = R = P$; in so far as the existence
of these polynomials is concerned the theorem is therefore true in this case.
We shall now assume as a basis for a proof by induction that the theorem is
true for polynomials of degree less than $r$ and that $r \ge s$. Since $|q_0| \neq 0$,
$q_0^{-1}$ exists and, as in ordinary scalar division, we have

$$P(\lambda) - p_0 q_0^{-1} \lambda^{r-s} Q(\lambda) = (p_1 - p_0 q_0^{-1} q_1)\lambda^{r-1} + \cdots = P_1(\lambda).$$

Since the degree of $P_1$ is less than $r$, we have by hypothesis $P_1(\lambda) \equiv P_2(\lambda)Q(\lambda)
+ R(\lambda)$, the degrees of $P_2$ and $R$ being less, respectively, than $r - s$ and $s$;
hence

$$P(\lambda) \equiv \bigl( p_0 q_0^{-1} \lambda^{r-s} + P_2(\lambda) \bigr) Q(\lambda) + R(\lambda) \equiv S(\lambda)Q(\lambda) + R(\lambda)$$

as required by the theorem. The existence of the right hand quotient and
remainder follows in the same way.

It remains to prove the uniqueness of $S$ and $R$. Suppose, if possible, that
$P \equiv SQ + R \equiv TQ + U$ where $R$ and $S$ are as above and $T$, $U$ are polynomials,
the degree of $U$ being less than $s$; then $(S - T)Q \equiv U - R$. If
$S - T \not\equiv 0$, then, since $|q_0| \neq 0$, the degree of the polynomial $(S - T)Q$ is
at least as great as that of $Q$ and is therefore greater than the degree of $U - R$.
It follows immediately that $S - T \equiv 0$, and hence also $U - R \equiv 0$, which
completes the proof of the theorem.

If $Q$ is a scalar polynomial, that is, if its coefficients $q_i$ are scalars, then $S = S_1$,
$R = R_1$; and, if the division is exact, then $Q(\lambda)$ is a factor of each of the coordinates
of $P(\lambda)$.

Theorem 2. If the matric polynomial (1) is divided on the right by $\lambda - a$,
the remainder is

$$p_0 a^r + p_1 a^{r-1} + \cdots + p_r$$

and, if it is divided on the left, the remainder is

$$a^r p_0 + a^{r-1} p_1 + \cdots + p_r.$$

As in ordinary algebra the proof follows immediately from the identity

$$\lambda^s - a^s = (\lambda - a)(\lambda^{s-1} + \lambda^{s-2} a + \cdots + a^{s-1})$$

in which the order of the factors is immaterial since $\lambda$ is a scalar.

If $P(\lambda)$ is a scalar polynomial, the right and left remainders are the same and
are conveniently denoted by $P(a)$.
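
The two remainders of Theorem 2 can be computed by a Horner-style division, as in the following sketch. It is an illustration under stated assumptions and not the author's own procedure: the function names and the sample coefficients are hypothetical, and a polynomial is stored as the list of its matrix coefficients $p_0, p_1, \cdots, p_r$.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def right_divide(coeffs, a):
    """Return (quotient coefficients, remainder) with P = S(lam)(lam - a) + R."""
    partial = [coeffs[0]]
    for p in coeffs[1:]:
        partial.append(add(mul(partial[-1], a), p))
    return partial[:-1], partial[-1]

def expand_right(quotient, a, remainder):
    """Multiply out S(lam)(lam - a) + R and return its coefficients."""
    zero = [[0] * len(a) for _ in a]
    out = []
    for k, term in enumerate(quotient + [remainder]):
        carried = mul(quotient[k - 1], a) if k > 0 else zero
        out.append(sub(term, carried))
    return out

def left_remainder(coeffs, a):
    """Remainder a^r p_0 + ... + p_r on division on the left by (lam - a)."""
    acc = coeffs[0]
    for p in coeffs[1:]:
        acc = add(mul(a, acc), p)
    return acc

# Hypothetical data: P(lam) = p0 lam^2 + p1 lam + p2 and a divisor lam - a.
p = [[[1, 1], [0, 1]], [[0, 1], [1, 0]], [[2, 0], [0, 3]]]
a = [[1, 2], [0, 1]]

quotient, right_rem = right_divide(p, a)
left_rem = left_remainder(p, a)
```

Multiplying the quotient back out recovers $P(\lambda)$, and for this choice of coefficients the right and left remainders differ, as is to be expected when the coefficients do not commute with $a$.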

2.04 Theorem 1 of the preceding section holds true as regards the existence
of $S$, $S_1$, $R$, $R_1$, and the degree of $R$, $R_1$ even when $|q_0| = 0$ provided
$|Q(\lambda)| \not\equiv 0$. Suppose the rank of $q_0$ is $t < n$; then by §1.10 it has the form
$\sum_{i=1}^{t} \alpha_i S \beta_i$ or, say, $h \bigl( \sum_{i=1}^{t} e_{ii} \bigr) k$ where $h$ and $k$ are non-singular matrices for which
$h e_i = \alpha_i$, $k' e_i = \beta_i$ $(i = 1, 2, \cdots, t)$. If $c_1 = \sum_{i=t+1}^{n} e_{ii}$, then

$$Q_1 = (c_1 \lambda + 1) h^{-1} Q \tag{3}$$

is a polynomial whose degree is not higher than the degree $s$ of $Q$ since $c_1 h^{-1} q_0
= 0$ so that the term in $\lambda^{s+1}$ is absent. Now, if $\eta = |h^{-1}|$, then

$$|Q_1| = |c_1 \lambda + 1|\, |h^{-1}|\, |Q| = (1 + \lambda)^{n-t}\, \eta\, |Q|,$$

so that the degree of $|Q_1|$ is greater than that of $|Q|$ by $n - t$. If the leading
coefficient of $Q_1$ is singular, this process may be repeated, and so on, giving
$Q_1, Q_2, \cdots$, where the degree of $|Q_i|$ is greater than that of $|Q_{i-1}|$. But
the degree of each $Q_i$ is less than or equal to $s$ and the degree of the determinant
of a polynomial of the $s$th degree cannot exceed $ns$. Hence at some stage the
leading coefficient of, say, $Q_j$ is not singular and, from the law of formation (3)
of the successive $Q$'s, we have $Q_j(\lambda) = H(\lambda)Q(\lambda)$, where $H(\lambda)$ is a matric
polynomial.

By Theorem 1, $Q_j$ taking the place of $Q$, we can find $S^*$ and $R$, the latter of
degree $s - 1$ at most, such that

$$P(\lambda) = S^*(\lambda)H(\lambda)Q(\lambda) + R(\lambda) = S(\lambda)Q(\lambda) + R(\lambda).$$

The theorem is therefore true even if $|q_0| = 0$ except that the quotient and
remainder are not necessarily unique and the degree of $S$ may be greater than
$r - s$, as is shown by taking $P = \lambda^2 - 1$, $Q = e_{11}\lambda + 1$, when we have

$$P = (e_{22}\lambda^2 + e_{11}\lambda - 1)Q = (e_{22}\lambda^2 + e_{11}\lambda - 1 + e_{12})Q - e_{12}.$$

2.05 The characteristic equation. If $x$ is a matrix, the scalar polynomial

$$f(\lambda) = |\lambda - x| = \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n \tag{4}$$

is called the characteristic function corresponding to $x$. We have already seen
(§1.05 (15)) that the product of a matrix and its adjoint equals its determinant;
hence

$$(\lambda - x)\, \mathrm{adj}(\lambda - x) = |\lambda - x| = f(\lambda).$$

It follows that the polynomial $f(\lambda)$ is exactly divisible by $\lambda - x$ so that by
the remainder theorem (§2.03, Theorem 2)

$$f(x) = 0. \tag{5}$$

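As a simple example of this we may take $x = \begin{Vmatrix} \alpha & \beta \\ \gamma & \delta \end{Vmatrix}$. Here

$$f(\lambda) = (\lambda - \alpha)(\lambda - \delta) - \beta\gamma = \lambda^2 - (\alpha + \delta)\lambda + \alpha\delta - \beta\gamma,$$

and

$$\begin{Vmatrix} \alpha^2 + \beta\gamma & \alpha\beta + \beta\delta \\ \gamma\alpha + \delta\gamma & \gamma\beta + \delta^2 \end{Vmatrix} - (\alpha + \delta) \begin{Vmatrix} \alpha & \beta \\ \gamma & \delta \end{Vmatrix} + (\alpha\delta - \beta\gamma) \begin{Vmatrix} 1 & 0 \\ 0 & 1 \end{Vmatrix} = 0.$$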


The following theorem is an important extension of this result. 


Theorem 3. If $f(\lambda) = |\lambda - x|$ and $\theta(\lambda)$ is the highest common factor of the
first minors of $|\lambda - x|$, and if

$$\varphi(\lambda) = f(\lambda)/\theta(\lambda), \tag{6}$$

the leading coefficient of $\theta(\lambda)$ being 1 (and therefore also that of $\varphi(\lambda)$), then

(i) $\varphi(x) = 0$;

(ii) if $\psi(\lambda)$ is any scalar polynomial such that $\psi(x) = 0$, then $\varphi(\lambda)$ is a factor
of $\psi(\lambda)$, that is, $\varphi(\lambda)$ is the scalar polynomial of lowest degree and with leading
coefficient 1 such that $\varphi(x) = 0$;

(iii) every root of $f(\lambda)$ is a root of $\varphi(\lambda)$.


The coordinates of $\mathrm{adj}(\lambda - x)$ are the first minors of $|\lambda - x|$ and therefore
by hypothesis $[\mathrm{adj}(\lambda - x)]/\theta(\lambda)$ is integral; also

$$\frac{\mathrm{adj}(\lambda - x)}{\theta(\lambda)}\, (\lambda - x) = \frac{f(\lambda)}{\theta(\lambda)} = \varphi(\lambda);$$

hence $\varphi(x) = 0$ by the remainder theorem.

If $\psi(\lambda)$ is any scalar polynomial for which $\psi(x) = 0$, we can find scalar polynomials
$M(\lambda)$, $N(\lambda)$ such that $M(\lambda)\varphi(\lambda) + N(\lambda)\psi(\lambda) \equiv \zeta(\lambda)$, where $\zeta(\lambda)$ is the
highest common factor of $\varphi$ and $\psi$. Substituting $x$ for $\lambda$ in this scalar identity
and using $\varphi(x) = 0 = \psi(x)$ we have $\zeta(x) = 0$; if, therefore, $\psi(x) = 0$ is a scalar
equation of lowest degree satisfied by $x$, we must have $\psi(\lambda) = \zeta(\lambda)$, apart from
a constant factor, so that $\psi(\lambda)$ is a factor of $\varphi(\lambda)$, say

$$\varphi(\lambda) = h(\lambda)\psi(\lambda). \tag{7}$$

Since $\psi(x) = 0$, $\lambda - x$ is a factor of $\psi(\lambda)$, say $\psi(\lambda) = (\lambda - x)g(\lambda)$, where $g$ is a
matric polynomial; hence

$$\frac{\varphi(\lambda)}{h(\lambda)} = \psi(\lambda) = (\lambda - x)\, g(\lambda).$$

Hence

$$g(\lambda) = \frac{f(\lambda)}{\theta(\lambda)h(\lambda)(\lambda - x)} = \frac{\mathrm{adj}(\lambda - x)}{\theta(\lambda)h(\lambda)}$$

and this cannot be integral unless $h(\lambda)$ is a constant in view of the fact that
$\theta(\lambda)$ is the highest common factor of the coordinates of $\mathrm{adj}(\lambda - x)$; it follows
that $\psi(\lambda)$ differs from $\varphi(\lambda)$ by at most a constant factor.

A repetition of the first part of this argument shows that, if $\psi(x) = 0$ is any
scalar equation satisfied by $x$, then $\varphi(\lambda)$ is a factor of $\psi(\lambda)$.

It remains to show that every root of $f(\lambda)$ is a root of $\varphi(\lambda)$. If $\lambda_1$ is any root
of $f(\lambda) = |\lambda - x|$, then from $\varphi(\lambda) = g(\lambda)(\lambda - x)$ we have

$$\varphi(\lambda_1) = g(\lambda_1)(\lambda_1 - x)$$

so that the determinant, $[\varphi(\lambda_1)]^n$, of the scalar matrix $\varphi(\lambda_1)$ equals $|g(\lambda_1)|\, |\lambda_1 - x|$,
which vanishes since $|\lambda_1 - x| = f(\lambda_1)$. This is only possible if $\varphi(\lambda_1) = 0$, that
is, if every root of $f(\lambda)$ is also a root of $\varphi(\lambda)$.

The roots of $f(\lambda)$ are also called the roots$^1$ of $x$, $\varphi(\lambda)$ is called the reduced
characteristic function of $x$, and $\varphi(x) = 0$ the reduced equation of $x$.
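
The difference between the characteristic function and the reduced characteristic function can be seen on two matrices which share the same $f(\lambda)$. The sketch below is an illustration only; the helper `poly_at`, which evaluates a scalar polynomial at a matrix by Horner's rule, and the two sample matrices are assumptions. Both matrices have $f(\lambda) = (\lambda - 2)^2(\lambda - 3)$; for the diagonal matrix the first minors of $|\lambda - x|$ have the common factor $\lambda - 2$, so that $\varphi(\lambda) = (\lambda - 2)(\lambda - 3)$, while for the second matrix the first minors have no common factor and $\varphi = f$.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def poly_at(coeffs, m):
    """Evaluate c0*lam^d + c1*lam^(d-1) + ... + cd at the matrix m (Horner's rule)."""
    n = len(m)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mul(result, m)
        for i in range(n):
            result[i][i] += c
    return result

f = [1, -7, 16, -12]    # (lam - 2)^2 (lam - 3)
phi = [1, -5, 6]        # (lam - 2)(lam - 3)

x = [[2, 0, 0], [0, 2, 0], [0, 0, 3]]   # hypothetical example: reduced function is phi
y = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]   # hypothetical example: reduced function is f

zero = [[0] * 3 for _ in range(3)]
f_of_x, phi_of_x = poly_at(f, x), poly_at(phi, x)
f_of_y, phi_of_y = poly_at(f, y), poly_at(phi, y)
```

Both matrices satisfy $f = 0$, in agreement with (5), but only the first satisfies the equation of lower degree.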

2.06 A few simple results are conveniently given at this point although they
are for the most part merely particular cases of later theorems. If $g(\lambda)$ is a
scalar polynomial, then on dividing by $\varphi(\lambda)$, whose degree we shall denote
by $\nu$, we may set $g(\lambda) = q(\lambda)\varphi(\lambda) + r(\lambda)$, where $q$ and $r$ are polynomials, the
degree of $r$ being less than $\nu$. Replacing $\lambda$ by $x$ in this identity and remembering
that $\varphi(x) = 0$, we have$^2$ $g(x) = r(x)$, that is, any polynomial can be replaced
by an equivalent polynomial of degree less than $\nu$.

$^1$ They are also called the latent roots of $x$.

$^2$ If $g(\lambda)$ is a matric polynomial whose coefficients are not all commutative with $x$, the
meaning of $g(x)$ is ambiguous; for instance, $x$ may be placed on the right of the coefficients,
or it may be put on the left. For such a polynomial we can say in general that it can be
replaced by an equal polynomial in which no power of $x$ higher than the $(\nu - 1)$th occurs.

If $g(\lambda)$ is a scalar polynomial which is a factor of $\varphi(\lambda)$, say $\varphi(\lambda) = h(\lambda)g(\lambda)$,
then $0 = \varphi(x) = h(x)g(x)$. It follows that $|g(x)| = 0$; for if this were not so,
we should have $h(x) = [g(x)]^{-1}\varphi(x) = 0$, whereas $x$ can satisfy no scalar equation
of lower degree than $\varphi$. Hence, if $g(\lambda)$ is a scalar polynomial which has a
factor in common with $\varphi(\lambda)$, then $g(x)$ is singular.

If a scalar polynomial $g(\lambda)$ has no factor in common with $\varphi(\lambda)$, there exist
scalar polynomials $M(\lambda)$, $N(\lambda)$ such that $M(\lambda)g(\lambda) + N(\lambda)\varphi(\lambda) \equiv 1$. Hence
$M(x)g(x) = 1$, or $[g(x)]^{-1} = M(x)$. It follows immediately that any finite
rational function of $x$ with scalar coefficients can be expressed as a scalar
polynomial in $x$ of degree $\nu - 1$ at most. It should be noticed carefully however
that, if $x$ is a variable matrix, the coefficients of the reduced polynomial
will in general contain the variable coordinates of $x$ and will not be integral
in these unless the original function is integral. It follows also that $g(x)$ is
singular only when $g(\lambda)$ has a factor in common with $\varphi(\lambda)$.

Finally we may notice here that similar matrices have the same reduced
equation; for, if $g$ is a scalar polynomial, $g(y^{-1}xy) = y^{-1}g(x)y$. As a particular
case of this we have that $xy$ and $yx$ have the same reduced equation if, say, $y$ is
non-singular; for $xy = y^{-1} \cdot yx \cdot y$. If both $x$ and $y$ are singular, it can be shown$^3$
that $xy$ and $yx$ have the same characteristic equation, but not necessarily the
same reduced equation as is seen from the example $x = e_{12}$, $y = e_{22}$.

$^3$ For example, by replacing $y$ by $y + \delta$, $\delta$ being a scalar, and considering the limiting
case when $\delta$ approaches 0.

2.07 Matrices with distinct roots. Because of its importance and comparative
simplicity we shall investigate the form of a matrix all of whose roots
are different before considering the general case. Let

$$f(\lambda) = |\lambda - x| = (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n) \tag{8}$$

where no two roots are equal and set

$$f_i(\lambda) = \frac{(\lambda - \lambda_1) \cdots (\lambda - \lambda_{i-1})(\lambda - \lambda_{i+1}) \cdots (\lambda - \lambda_n)}{(\lambda_i - \lambda_1) \cdots (\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1}) \cdots (\lambda_i - \lambda_n)} = \frac{f(\lambda)}{f'(\lambda_i)(\lambda - \lambda_i)}. \tag{9}$$

By the Lagrange interpolation formula, $\sum_i f_i(\lambda) \equiv 1$; hence

$$f_1(x) + f_2(x) + \cdots + f_n(x) = 1. \tag{10}$$

Further, $f(\lambda)$ is a factor of $f_i(\lambda)f_j(\lambda)$ $(i \neq j)$ so that

$$f_i(x)f_j(x) = 0 \qquad (i \neq j); \tag{11}$$

hence, multiplying (10) by $f_i(x)$ and using (11) we have

$$[f_i(x)]^2 = f_i(x). \tag{12}$$

Again, $(\lambda - \lambda_i)f_i(\lambda) = f(\lambda)/f'(\lambda_i)$; hence $(x - \lambda_i)f_i(x) = 0$, that is,

$$x f_i(x) = \lambda_i f_i(x), \tag{13}$$

whence, summing with regard to $i$ and using (10), we have

$$x = \lambda_1 f_1(x) + \lambda_2 f_2(x) + \cdots + \lambda_n f_n(x). \tag{14}$$

If we form $x^r$ from (14), $r$ being a positive integer, it is immediately seen from
(11) and (12), or from the Lagrange interpolation formula, that

$$x^r = \lambda_1^r f_1 + \lambda_2^r f_2 + \cdots + \lambda_n^r f_n, \tag{15}$$

where $f_i$ stands for $f_i(x)$, and it is easily verified by actual multiplication that,
if no root is 0,

$$x^{-1} = \lambda_1^{-1} f_1 + \lambda_2^{-1} f_2 + \cdots + \lambda_n^{-1} f_n$$

so that (15) holds for negative powers also. The matrices $f_i$ are linearly independent.
For if $\sum \gamma_i f_i = 0$, then

$$0 = f_j \sum \gamma_i f_i = \gamma_j f_j^2 = \gamma_j f_j$$

whence every $\gamma_j = 0$ seeing that in the case we are considering $f(\lambda)$ is itself
the reduced characteristic function so that $f_j(x) \neq 0$.

From these results we have that, if $g(\lambda)$ is any scalar rational function whose
denominator has no factor in common with $\varphi(\lambda)$, then

$$g(x) = g(\lambda_1)f_1 + g(\lambda_2)f_2 + \cdots + g(\lambda_n)f_n. \tag{16}$$

It follows from this that the roots of $g(x)$ are $g(\lambda_i)$ $(i = 1, 2, \cdots, n)$. For
setting $y = g(x)$, $\mu_i = g(\lambda_i)$, we have as above

$$\psi(y) = \sum \psi(\mu_i) f_i,$$

$\psi(\lambda)$ being a scalar polynomial. Now $\psi(y)f_i = \psi(\mu_i)f_i$; hence, if $\psi(y) = 0$,
then also $\psi(\mu_i) = 0$ $(i = 1, 2, \cdots, n)$; and conversely. Hence, if the notation
is so chosen that $\mu_1, \mu_2, \cdots, \mu_r$ are the distinct values of $\mu_i$, the reduced characteristic
function of $y = g(x)$ is $\prod_{i=1}^{r} (\lambda - \mu_i)$.
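
The matrices $f_i(x)$ of this section are readily computed for a small matrix with distinct roots. The following sketch is an illustration and not part of the lectures; the names `idempotents` and `combine` and the sample matrix are assumptions, and exact rational arithmetic is used so that the relations can be tested for equality.

```python
from fractions import Fraction

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def idempotents(x, roots):
    """f_i(x) = product over j != i of (x - root_j) / (root_i - root_j), roots distinct."""
    n = len(x)
    result = []
    for i, li in enumerate(roots):
        f = [[Fraction(int(p == q)) for q in range(n)] for p in range(n)]
        for j, lj in enumerate(roots):
            if j != i:
                factor = [[Fraction(x[p][q] - (lj if p == q else 0), li - lj)
                           for q in range(n)] for p in range(n)]
                f = mul(f, factor)
        result.append(f)
    return result

def combine(weights, mats):
    """Weighted sum of matrices: sum of weights[i] * mats[i]."""
    n = len(mats[0])
    return [[sum(w * m[p][q] for w, m in zip(weights, mats)) for q in range(n)]
            for p in range(n)]

# Hypothetical example: a triangular matrix with the distinct roots 2 and 3.
x = [[2, 1], [0, 3]]
roots = [2, 3]
f = idempotents(x, roots)

identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
x_cubed = mul(x, mul(x, x))
x_inverse = combine([Fraction(1, r) for r in roots], f)   # (15) with r = -1
```

The relations (10), (11), (12) and (14) can then be checked directly, together with (15) for the powers 3 and $-1$.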


2.08 If the determinant $|\lambda - x| = f(\lambda)$ is expanded in powers of $\lambda$, it is easily
seen$^4$ that the coefficient $a_r$ of $\lambda^{n-r}$ is $(-1)^r$ times the sum of the principal
minors of $x$ of order $r$; this coefficient is therefore a homogeneous polynomial of
degree $r$ in the coordinates of $x$. In particular, $-a_1$ is the sum of the coordinates
in the main diagonal: this sum is called the trace of $x$ and is denoted
by $\mathrm{tr}\, x$.

$^4$ For instance, by differentiating $|\lambda - x|$ $n - r$ times with respect to $\lambda$ and then setting
$\lambda = 0$.

If $y$ is an arbitrary matrix, $\mu$ a scalar variable, and $z = x + \mu y$, the coefficients
of the characteristic equation of $z$, say

$$z^n + b_1 z^{n-1} + \cdots + b_n = 0, \tag{17}$$

are polynomials in $\mu$ of the form

$$b_s = a_{s0} + \mu a_{s1} + \cdots + \mu^s a_{ss}, \qquad (a_{s0} = a_s,\ a_{00} = 1) \tag{18}$$

and the powers of $z$ are also polynomials in $\mu$, say

$$z^r = x^r + \mu \begin{Bmatrix} x & y \\ r-1 & 1 \end{Bmatrix} + \mu^2 \begin{Bmatrix} x & y \\ r-2 & 2 \end{Bmatrix} + \cdots + \mu^r y^r, \tag{19}$$

where $\begin{Bmatrix} x & y \\ s & t \end{Bmatrix}$ is obtained by multiplying $s$ $x$'s and $t$ $y$'s together in every possible
way and adding the terms so obtained, e.g.,

$$\begin{Bmatrix} x & y \\ 2 & 1 \end{Bmatrix} = x^2 y + xyx + yx^2.$$

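If we substitute (18) and (19) in (17) and arrange according to powers of $\mu$,
then, since $\mu$ is an independent variable, the coefficients of its several powers
must be zero. This leads to a series of relations connecting $x$ and $y$ of the form

$$\sum_{i,\,j} a_{ij} \begin{Bmatrix} x & y \\ n - s - i + j & s - j \end{Bmatrix} = 0 \qquad (s = 0, 1, 2, \cdots) \tag{20}$$

where $a_{ij}$ are the coefficients defined in (18) and $\begin{Bmatrix} x & y \\ n - s - i + j & s - j \end{Bmatrix}$ is
replaced by 0 when $j > s$. In particular, if $s = 1$,

$$\begin{Bmatrix} x & y \\ n-1 & 1 \end{Bmatrix} + a_1 \begin{Bmatrix} x & y \\ n-2 & 1 \end{Bmatrix} + \cdots + a_{n-1}\, y + a_{11} x^{n-1} + \cdots + a_{n1} = 0$$

which, when $xy = yx$, becomes

$$f'(x)\, y = -(a_{11} x^{n-1} + \cdots + a_{n1}) = g(x).$$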

When $x$ has no repeated roots, $f'(\lambda)$ has no root in common with $f(\lambda)$ and $f'(x)$
has an inverse (cf. §2.06) so that $y = g(x)/f'(x)$ which can be expressed as a
scalar polynomial in $x$; and conversely every such polynomial is commutative
with $x$. We therefore have the following theorem:

Theorem 4. If $x$ has no multiple roots, the only matrices commutative with
it are scalar polynomials in $x$.

2.09 Matrices with multiple roots. We shall now extend the main results
of §2.07 to matrices whose roots are not necessarily simple. Suppose in the
first place that $x$ has only one distinct root and that its reduced characteristic
function is $\varphi(\lambda) = (\lambda - \lambda_1)^{\nu}$, and set

$$\eta_1^{\,i} = \eta_i = (x - \lambda_1)^i = (x - \lambda_1)\eta_{i-1} \qquad (i = 1, 2, \cdots, \nu - 1);$$

then

$$\eta_1^{\,\nu} = 0, \quad x\eta_{\nu-1} = \lambda_1 \eta_{\nu-1}, \quad x\eta_i = \lambda_1 \eta_i + \eta_{i+1} \qquad (i = 1, 2, \cdots, \nu - 2)$$

and

$$x^s = (\lambda_1 + \eta_1)^s = \lambda_1^s + \binom{s}{1} \lambda_1^{s-1} \eta_1 + \cdots + \binom{s}{\nu-1} \lambda_1^{s-\nu+1} \eta_1^{\,\nu-1}$$

where the binomial expansion is cut short with the term $\eta_1^{\,\nu-1}$ since $\eta_1^{\,\nu} = 0$.
Again, if $g(\lambda)$ is any scalar polynomial, then

$$g(x) = g(\lambda_1 + \eta_1) = g(\lambda_1) + g'(\lambda_1)\eta_1 + \cdots + \frac{g^{(\nu-1)}(\lambda_1)}{(\nu - 1)!}\, \eta_1^{\,\nu-1}.$$

It follows immediately that, if $g^{(s)}(\lambda)$ is the first derivative of $g(\lambda)$ which is not
0 when $\lambda = \lambda_1$ and $(k - 1)s < \nu \le ks$, then the reduced equation of $g(x)$ is

$$[g(x) - g(\lambda_1)]^k = 0.$$

It should be noted that the first $\nu - 1$ powers of $\eta_1$ are linearly independent
since $\varphi(\lambda)$ is the reduced characteristic function of $x$.

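2.10 We shall now suppose that $x$ has more than one root. Let the reduced
characteristic function be

$$\varphi(\lambda) = \prod_{i=1}^{r} (\lambda - \lambda_i)^{\nu_i} \qquad \Bigl( \sum \nu_i = \nu,\ r > 1 \Bigr) \tag{21}$$

and set

$$h_i(\lambda) = \varphi(\lambda)/(\lambda - \lambda_i)^{\nu_i}. \tag{22}$$

We can determine two scalar polynomials, $M_i(\lambda)$ and $N_i(\lambda)$, of degrees not
exceeding $\nu_i - 1$ and $\nu - \nu_i - 1$, respectively, such that

$$M_i(\lambda)h_i(\lambda) + (\lambda - \lambda_i)^{\nu_i} N_i(\lambda) \equiv 1, \qquad M_i(\lambda_i) \neq 0.$$

If we set

$$\varphi_i(\lambda) = M_i(\lambda)h_i(\lambda), \tag{23}$$

then $1 - \sum \varphi_i(\lambda)$ is exactly divisible by $\varphi(\lambda)$ and, being of degree $\nu - 1$ at most,
must be identically 0; hence

$$\sum_{i=1}^{r} \varphi_i(\lambda) \equiv 1. \tag{24}$$

Again, from (22) and (23), $\varphi(\lambda)$ is a factor of $\varphi_i(\lambda)\varphi_j(\lambda)$ $(i \neq j)$ and hence on
multiplying (24) by $\varphi_i(\lambda)$ we have

$$[\varphi_i(\lambda)]^2 \equiv \varphi_i(\lambda), \quad \varphi_i(\lambda)\varphi_j(\lambda) \equiv 0, \quad \mathrm{mod}\ \varphi(\lambda) \qquad (i \neq j). \tag{25}$$

Further, if $g(\lambda)$ is a scalar polynomial, then

$$g(\lambda) = \sum_{i=1}^{r} g(\lambda)\varphi_i(\lambda) = \sum_{i=1}^{r} \Bigl[ g(\lambda_i) + g'(\lambda_i)(\lambda - \lambda_i) + \cdots + \frac{g^{(\nu_i-1)}(\lambda_i)}{(\nu_i - 1)!} (\lambda - \lambda_i)^{\nu_i-1} \Bigr] \varphi_i(\lambda) + R \tag{26}$$

where $R$ is divisible by $\varphi(\lambda)$ and therefore vanishes when $x$ is substituted for $\lambda$.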

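2.11 If we put $x$ for $\lambda$ in (23) and set $\varphi_i$ for $\varphi_i(x)$, then (24) and (25) show that

$$\varphi_i^2 = \varphi_i, \quad \varphi_i\varphi_j = 0 \ (i \neq j), \quad \sum_{i=1}^{r} \varphi_i = 1. \tag{27}$$

It follows as in §2.07 that the matrices $\varphi_i$ are linearly independent and none is
zero, since $\varphi_i(\lambda_i) \neq 0$ so that $\varphi(\lambda)$ is not a factor of $\varphi_i(\lambda)$, which would be the
case were $\varphi_i(x) = 0$. We now put $x$ for $\lambda$ in (26) and set

$$\eta_i = (x - \lambda_i)\varphi_i \qquad (i = 1, 2, \cdots, r). \tag{28}$$

Since the $\nu_i$th power of $(\lambda - \lambda_i)\varphi_i(\lambda)$ is the first which has $\varphi(\lambda)$ as a factor, $\eta_i$
is a nilpotent matrix of index $\nu_i$ (cf. §1.05) and, remembering that $\varphi_i^2 = \varphi_i$,
we have

$$\eta_i^{\,j} = (x - \lambda_i)^j \varphi_i \neq 0 \ (j < \nu_i), \qquad \eta_i\varphi_i = \eta_i = \varphi_i\eta_i, \tag{29}$$

$$x\varphi_i = \lambda_i\varphi_i + \eta_i, \qquad x\eta_i^{\,j} = \lambda_i \eta_i^{\,j} + \eta_i^{\,j+1}; \tag{30}$$

equation (26) therefore becomes

$$g(x) = \sum_{i=1}^{r} \Bigl[ g(\lambda_i)\varphi_i + g'(\lambda_i)\eta_i + \cdots + \frac{g^{(\nu_i-1)}(\lambda_i)}{(\nu_i - 1)!}\, \eta_i^{\,\nu_i-1} \Bigr] \tag{31}$$

and in particular

$$x = \sum (\lambda_i\varphi_i + \eta_i) = \sum x_i. \tag{32}$$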

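The matrices $\varphi_i$ and $\eta_i$ are called the principal idempotent and nilpotent
elements of $x$ corresponding to the root $\lambda_i$. The matrices $\varphi_i$ are uniquely determined
by the following conditions: if $\psi_i$ $(i = 1, 2, \cdots, r)$ are any matrices
such that

(33)
(i) $x\psi_i = \psi_i x$,

(ii) $(x - \lambda_i)\psi_i$ is nilpotent,

(iii) $\sum_i \psi_i = 1$, $\psi_i\psi_j = 0$ $(i \neq j)$,

then $\psi_i = \varphi_i$ $(i = 1, 2, \cdots, r)$. For let $\theta_{ij} = \varphi_i\psi_j$; from (i) $\theta_{ij}$ also equals
$\psi_j\varphi_i$. From (ii) and (28)

$$\eta_i = x\varphi_i - \lambda_i\varphi_i, \qquad \zeta_j = x\psi_j - \lambda_j\psi_j$$

are both nilpotent and, since $\varphi_i$ and $\eta_i$ are polynomials in $x$, they are commutative
with $\psi_j$ and therefore with $\zeta_j$; also

$$x\theta_{ij} = \lambda_i\varphi_i\psi_j + (x - \lambda_i)\varphi_i\psi_j = \lambda_i\theta_{ij} + \eta_i\psi_j,$$

$$x\theta_{ij} = \lambda_j\varphi_i\psi_j + (x - \lambda_j)\psi_j\varphi_i = \lambda_j\theta_{ij} + \zeta_j\varphi_i.$$

Hence $(\lambda_i - \lambda_j)\theta_{ij} = \zeta_j\varphi_i - \eta_i\psi_j$. But if $\mu$ is the greater of the indices of $\zeta_j$
and $\eta_i$, then, since all the matrices concerned are commutative, each term of
$(\zeta_j\varphi_i - \eta_i\psi_j)^{2\mu}$ contains $\zeta_j^{\,\mu}$ or $\eta_i^{\,\mu}$ as a factor and is therefore 0. If $\theta_{ij} \neq 0$, this
is impossible when $i \neq j$ since $\theta_{ij}$ is idempotent and $\lambda_i - \lambda_j \neq 0$. Hence
$\varphi_i\psi_j = 0$ when $i \neq j$ and from (iii)

$$\varphi_i = \varphi_i \sum_j \psi_j = \varphi_i\psi_i = \Bigl( \sum_j \varphi_j \Bigr) \psi_i = \psi_i$$

which proves the uniqueness of the $\varphi$'s.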
which proves the uniqueness of the (p’s. 

2.12 We shall now determine the reduced equation of $g(x)$. If we set $g_i$ for
$g(x)\varphi_i$, then

$$g_i = g(\lambda_i)\varphi_i + g'(\lambda_i)\eta_i + \cdots + \frac{g^{(\nu_i-1)}(\lambda_i)}{(\nu_i - 1)!}\, \eta_i^{\,\nu_i-1} = g(\lambda_i)\varphi_i + \zeta_i, \tag{34}$$

say, and if $s_i$ is the order of the first derivative in (34) which is not 0, then
$\zeta_i$ is a nilpotent matrix whose index $k_i$ is given by $k_i - 1 < \nu_i/s_i \le k_i$.

If $\Phi(\lambda)$ is a scalar polynomial, and $\gamma_i = g(\lambda_i)$,

$$\Phi(g(x)) = \sum \Phi(g_i)\varphi_i = \sum \Bigl[ \Phi(\gamma_i)\varphi_i + \Phi'(\gamma_i)\zeta_i + \cdots + \frac{\Phi^{(k_i-1)}(\gamma_i)}{(k_i - 1)!}\, \zeta_i^{\,k_i-1} \Bigr]$$

so that $\Phi(g(x)) = 0$ if, and only if, $g(\lambda_i)$ is a root of $\Phi(\lambda)$ of multiplicity at least $k_i$.
Hence, if

$$\Psi(\lambda) = \prod [\lambda - g(\lambda_i)]^{k_i}$$

where when two or more values of $i$ give the same value of $g(\lambda_i)$, only that one
is to be taken for which $k_i$ is greatest, then $\Psi(\lambda)$ is the reduced characteristic
function of $g(x)$. As a part of this result we have the following theorem.

Theorem 5. If $g(\lambda)$ is a scalar polynomial and $x$ a matrix whose distinct
roots are $\lambda_1, \lambda_2, \cdots, \lambda_r$, the roots of the matrix $g(x)$ are$^5$

$$g(\lambda_1), g(\lambda_2), \cdots, g(\lambda_r).$$

If the roots $g(\lambda_i)$ are all distinct, the principal idempotent elements of $g(x)$
are the same as those of $x$; for conditions (33) of §2.11 as applied to $g(x)$ are satisfied
by $\varphi_i$ $(i = 1, 2, \cdots, r)$, and these conditions were shown to characterize
the principal idempotent elements completely.

2.13 The square root of a matrix. Although the general question of functions
of a matrix will not be taken up till a later chapter, it is convenient to give
here one determination of the square root of a matrix $x$.

$^5$ That these are roots of $g(x)$ follows immediately from the fact that $\lambda - x$ is a factor
of $g(\lambda) - g(x)$; but it does not follow so readily from this that the only roots are those
given except, of course, when $r = n$ and all the quantities $g(\lambda_i)$ are distinct.

If $\alpha$ and $\beta$ are scalars, $\alpha \neq 0$, and $(\alpha + \beta)^{1/2}$ is expanded formally in a Taylor
series,

$$(\alpha + \beta)^{1/2} = \alpha^{1/2} \sum_{r=0}^{\infty} \delta_r (\beta/\alpha)^r,$$

then, if $S_\nu = \alpha^{1/2} \sum_{r=0}^{\nu-1} \delta_r (\beta/\alpha)^r$, it follows that

$$S_\nu^2 = \alpha + \beta + \alpha T_\nu \tag{35}$$

where $T_\nu$ is a polynomial in $\beta/\alpha$ which contains no power of $\beta/\alpha$ lower than the
$\nu$th. If $a$ and $b$ are commutative matrices and $a$ is the square of a known non-singular
matrix $a^{1/2}$, then (35) being an algebraic identity in $\alpha$ and $\beta$ remains
true when $a$ and $b$ are put in their place.

If $x_i = \lambda_i\varphi_i + \eta_i$ is the matrix defined in §2.11 (32), then so long as $\lambda_i \neq 0$,
we may set $a = \lambda_i\varphi_i$, $b = \eta_i$ since $\lambda_i\varphi_i = (\lambda_i^{1/2}\varphi_i)^2$; and in this case the Taylor
series terminates since $\eta_i^{\,\nu_i} = 0$, that is, $T_{\nu_i} = 0$ and the square of the terminating
series for $(\lambda_i\varphi_i + \eta_i)^{1/2}$ in powers of $\eta_i$ equals $\lambda_i\varphi_i + \eta_i$. It follows immediately
from (32) and (27) that, if $x$ is a matrix no one of whose roots is 0, the square
of the matrix

$$x^{1/2} = \sum_i \lambda_i^{1/2} \Bigl[ \varphi_i + \frac{1}{2} \frac{\eta_i}{\lambda_i} - \cdots + (-1)^{\nu_i} \frac{(2\nu_i - 4)!}{2^{2\nu_i-3} (\nu_i - 2)!\, (\nu_i - 1)!} \Bigl( \frac{\eta_i}{\lambda_i} \Bigr)^{\nu_i-1} \Bigr] \tag{36}$$

is $x$.

If the reduced equation of $x$ has no multiple roots, (36) becomes

$$x^{1/2} = \sum \lambda_i^{1/2} \varphi_i \tag{37}$$

and this is valid even if one of the roots is 0. If, however, 0 is a multiple root
of the reduced equation, $x$ may have no square root as, for example, the
matrix $\begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}$.

Formula (36) gives $2^r$ determinations of $x^{1/2}$ but we shall see later that an
infinity of determinations is possible in certain cases.
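
The terminating series for the square root can be tried on a matrix with a single non-zero root. The sketch below is an illustration under stated assumptions and is not the author's computation: the function name `sqrt_single_root` and the sample matrix are hypothetical, the matrix is assumed to have one distinct root so that its principal idempotent element is 1, and the root is chosen to be a perfect square so that exact rational arithmetic can be used. The coefficients of the Taylor series are the binomial coefficients of index $\tfrac{1}{2}$, generated by the usual recurrence.

```python
from fractions import Fraction

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sqrt_single_root(x, root, sqrt_root, index):
    """Square root of x = root*1 + eta, where eta is nilpotent of the given index.

    Uses sqrt_root * sum_{r < index} delta_r (eta/root)^r with delta_r = C(1/2, r).
    """
    n = len(x)
    ratio = [[Fraction(x[i][j] - (root if i == j else 0), root) for j in range(n)]
             for i in range(n)]                       # eta / root
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [[Fraction(0)] * n for _ in range(n)]
    delta = Fraction(1)
    for r in range(index):
        total = [[total[i][j] + delta * power[i][j] for j in range(n)] for i in range(n)]
        power = mul(power, ratio)
        delta = delta * (Fraction(1, 2) - r) / (r + 1)   # next binomial coefficient
    return [[sqrt_root * v for v in row] for row in total]

# Hypothetical example: single root 4, nilpotent part e12 + e23 of index 3.
x = [[4, 1, 0], [0, 4, 1], [0, 0, 4]]
root_of_x = sqrt_single_root(x, root=4, sqrt_root=2, index=3)
squared = mul(root_of_x, root_of_x)
```

Taking `sqrt_root = -2` instead gives the other determination furnished by the formula for this matrix.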

2.14 Reducible matrices. If $x = x_1 + x_2$ is the direct sum of $x_1$ and $x_2$ and
$e_1$, $e_2$ are the corresponding idempotent elements, that is,

$$e_i x = x_i, \quad e_i e_j = 0 \qquad (i \neq j;\ i, j = 1, 2),$$

then $x^r = x_1^r + x_2^r$ $(r \ge 2)$ and we may set as before $1 = x^0 = x_1^0 + x_2^0 = e_1 + e_2$.
Hence, if $f(\lambda) = \lambda^m + b_1 \lambda^{m-1} + \cdots + b_m$ is any scalar polynomial, we have

$$f(x) = e_1 f(x_1) + e_2 f(x_2) = f(x_1) + f(x_2) - b_m,$$

and if $g(\lambda)$ is a second scalar polynomial

$$f(x)g(x) = e_1 f(x_1)g(x_1) + e_2 f(x_2)g(x_2).$$

Now if $f_i(\lambda)$ is the reduced characteristic function of $x_i$ regarded as a matrix
in the space determined by $e_i$, then the reduced characteristic function of $x_i$
as a matrix in the original fundamental space is clearly $\lambda f_i(\lambda)$ unless $\lambda$ is a
factor of $f_i(\lambda)$ in which case it is simply $f_i(\lambda)$. Further the reduced characteristic
function of $x = x_1 + x_2$ is clearly the least common multiple of $f_1(\lambda)$ and
$f_2(\lambda)$; for if

$$\psi(\lambda) = f_1(\lambda)g_1(\lambda) = f_2(\lambda)g_2(\lambda)$$

then

$$\psi(x_1 + x_2) = e_1\psi(x_1) + e_2\psi(x_2) = e_1 f_1(x_1)g_1(x_1) + e_2 f_2(x_2)g_2(x_2) = 0.$$



CHAPTER III

INVARIANT FACTORS AND ELEMENTARY DIVISORS

3.01 Elementary transformations. By an elementary transformation of a
matric polynomial $a(\lambda) = \| a_{ij} \|$ is meant one of the following operations on
the rows or columns.

Type I. The operation of adding to a row (column) a different row (column)
multiplied by a scalar polynomial $\theta(\lambda)$.

Type II. The operation of interchanging two rows (columns).

Type III. The operation of multiplying a row (column) by a constant $k \neq 0$.

These transformations can be performed algebraically as follows. 

Type I. Let

\[ P_{ij} = 1 + \theta(\lambda)e_{ij} \qquad (i \neq j), \]

$\theta(\lambda)$ being a scalar polynomial; then $|P_{ij}| = 1$ and

\[ P_{ij}\,a = \sum_{p,q} a_{pq}e_{pq} + \theta\sum_{q} a_{jq}e_{iq}, \]

which is the matrix derived from $a(\lambda)$ by adding $\theta$ times the $j$th row to the $i$th.
The corresponding operation on the columns is equivalent to forming the
product $aP_{ji}$.

Type II. Let $Q_{ij}$ be the matrix

\[ Q_{ij} = 1 - e_{ii} - e_{jj} + e_{ij} + e_{ji} \qquad (i \neq j), \]

that is, $Q_{ij}$ is the matrix derived from the identity matrix by inserting 1 in
place of 0 in the coefficients of $e_{ij}$ and $e_{ji}$ and 0 in place of 1 in the coefficients
of $e_{ii}$ and $e_{jj}$; then $|Q_{ij}| = -1$ and

\[ Q_{ij}\,a = \sum_{p,q} a_{pq}e_{pq} - \sum_q a_{iq}e_{iq} - \sum_q a_{jq}e_{jq} + \sum_q a_{jq}e_{iq} + \sum_q a_{iq}e_{jq}, \]

that is, $Q_{ij}a$ is derived from $a$ by interchanging the $i$th and $j$th rows. Similarly
$aQ_{ij}$ is obtained by interchanging the $i$th and $j$th columns.

Since any permutation can be effected by a succession of transpositions, the
corresponding transformation in the rows (columns) of a matrix can be
produced by a succession¹ of transformations of Type II.

Type III. This transformation is effected on the $r$th row (column) by
multiplying on the left (right) by $R = 1 + (k - 1)e_{rr}$; it is used only when it is
convenient to make the leading coefficient in some term equal to 1.

¹ The transformation corresponding to the substitution $\begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$ is $q = \sum_r e_{r i_r}$.

The inverses of the matrices used in these transformations are

\[ P_{ij}^{-1} = 1 - \theta e_{ij}, \qquad Q_{ij}^{-1} = Q_{ij}, \qquad R^{-1} = 1 + (k^{-1} - 1)e_{rr}; \]

these inverses are elementary transformations. The transverses are also
elementary since $P_{ij}' = P_{ji}$ and $Q_{ij}$ and $R$ are symmetric.²
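The three matrices $P_{ij}$, $Q_{ij}$, $R$ can be tried out on a small numerical array. The sketch below is an illustration only: it takes $\theta$ to be a constant rather than a polynomial, uses 0-based indices, and the helper names (`unit`, `combine`, `P`, `Q`, `R`) are assumptions introduced here, not notation from the lectures.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]

def unit(n, i, j):
    """Matrix unit e_ij of order n (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def combine(a, k, b):
    """Return a + k*b, entry by entry."""
    return [[x + k * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def P(n, i, j, theta):
    """Type I: 1 + theta * e_ij, with i != j."""
    return combine(identity(n), theta, unit(n, i, j))

def Q(n, i, j):
    """Type II: 1 - e_ii - e_jj + e_ij + e_ji."""
    q = identity(n)
    for k, u in ((-1, unit(n, i, i)), (-1, unit(n, j, j)),
                 (1, unit(n, i, j)), (1, unit(n, j, i))):
        q = combine(q, k, u)
    return q

def R(n, r, k):
    """Type III: 1 + (k - 1) * e_rr, with k != 0."""
    return combine(identity(n), k - 1, unit(n, r, r))

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
row_added = matmul(P(3, 0, 2, 10), a)      # row 0 becomes row 0 + 10 * row 2
col_added = matmul(a, P(3, 2, 0, 10))      # column 0 becomes column 0 + 10 * column 2
rows_swapped = matmul(Q(3, 0, 1), a)       # rows 0 and 1 interchanged
cols_swapped = matmul(a, Q(3, 0, 1))       # columns 0 and 1 interchanged
row_scaled = matmul(R(3, 1, 5), a)         # row 1 multiplied by 5
```

Multiplying on the left acts on rows and multiplying on the right acts on columns, and `matmul(P(3, 0, 2, 10), P(3, 0, 2, -10))` returns the identity, in line with the formula for $P_{ij}^{-1}$.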

A matric polynomial $b(\lambda)$ which is derived from $a(\lambda)$ by a sequence of
elementary transformations is said to be equivalent to $a(\lambda)$; every such
polynomial has the form $p(\lambda)a(\lambda)q(\lambda)$ where $p$ and $q$ are products of elementary
transformations. Since the inverse of an elementary transformation is
elementary, $a(\lambda)$ is also equivalent to $b(\lambda)$. Further, the inverses of $p$ and $q$
are polynomials so that these are what we have already called elementary
polynomials; we shall see later that every elementary polynomial can be
derived from 1 by a sequence of elementary transformations.

In the following sections we require two lemmas whose proofs are almost
immediate.


Lemma 1. The rank³ of a matrix is not altered by an elementary transformation.
For if $|P| \neq 0$, $AP$ and $PA$ have the same rank as $A$ (§1.10).

Lemma 2. The highest common factor of the coordinates of a matric polynomial 
is not altered by an elementary transformation. 

This follows immediately from the definition of elementary transformations. 

3.02 The normal form of a matrix. The theorem we shall prove in this sec- 
tion is as follows. 


Theorem 1. If $a(\lambda)$ is a matric polynomial of rank $r$, it can be reduced by
elementary transformations to a diagonal matrix

\[ \sum_{i=1}^{r} \alpha_i(\lambda)e_{ii} = \begin{pmatrix} \alpha_1(\lambda) & & & & & \\ & \alpha_2(\lambda) & & & & \\ & & \ddots & & & \\ & & & \alpha_r(\lambda) & & \\ & & & & 0 & \\ & & & & & \ddots \end{pmatrix} = P(\lambda)a(\lambda)Q(\lambda), \tag{1} \]

where the coefficient of the highest power of $\lambda$ in each polynomial $\alpha_i(\lambda)$ is 1, $\alpha_i$ is a
factor of $\alpha_{i+1}, \cdots, \alpha_r$ ($i = 1, 2, \cdots, r - 1$), and $P(\lambda)$, $Q(\lambda)$ are elementary
polynomials.

² The definition of an elementary transformation given above is the most convenient
but not the only possible one. All three transformations have the form $T = 1 + xSy$
with the condition that $1 + Sxy$ is not 0 and is independent of $\lambda$.

³ By the rank of a matric polynomial is meant the order of the highest minor which does
not vanish identically. For particular values of $\lambda$ the rank may be smaller than $r$; there
are always values of $\lambda$ for which it equals $r$ and it cannot be greater.

We shall first show that, if the coordinate of $a(\lambda)$ of minimum degree $m$, say
$a_{pq}$, is not a factor of every other coordinate, then $a(\lambda)$ is equivalent to a matrix
in which the degree of the coordinate of minimum degree is less than $m$.

Suppose that $a_{pq}$ is not a factor of $a_{pi}$ for some $i$; then we may set $a_{pi} =
\beta a_{pq} + a'_{pi}$ where $\beta$ is integral and $a'_{pi}$ is not 0 and is of lower degree than $m$.
Subtracting $\beta$ times the $q$th column from the $i$th we have an equivalent matrix
in which the coordinate⁴ $(p, i)$ is $a'_{pi}$, whose degree is less than $m$. The same
reasoning applies if $a_{pq}$ is not a factor of every coordinate $a_{iq}$ in the $q$th column.

After a finite number of such steps we arrive at a matrix in which a coordinate
of minimum degree, say $k_{pq}$, is a factor of all the coordinates which lie in the
same row or column, but is possibly not a factor of some other coordinate $k_{ij}$.
When this is so, let $k_{pj} = \beta k_{pq}$, $k_{iq} = \gamma k_{pq}$ where $\beta$ and $\gamma$ are integral. If we
now add $(1 - \beta)$ times the $q$th column to the $j$th, $(p, j)$ and $(i, j)$ become
respectively

\[ k'_{pj} = k_{pj} + (1 - \beta)k_{pq} = k_{pq}, \qquad k'_{ij} = k_{ij} + (1 - \beta)k_{iq} = k_{ij} + (1 - \beta)\gamma k_{pq}. \]

Here either the degree of $k'_{ij}$ is less than that of $k_{pq}$, or $k'_{pj}$ has the minimum
degree and is not a factor of $k'_{ij}$, which lies in the same column, and hence the
minimum degree can be lowered as above.

The process just described can be repeated so long as the coordinate of lowest
degree is not a factor of every other coordinate and, since each step lowers the
minimum degree, we derive in a finite number of steps a matrix $\| b'_{ij} \|$ which
is equivalent to $a(\lambda)$ and in which the coordinate of minimum degree is in fact
a divisor of every other coordinate; and further we may suppose that $b'_{11} =
\alpha_1(\lambda)$ is a coordinate of minimum degree and set $b'_{1i} = \gamma_i b'_{11}$, $b'_{j1} = \delta_j b'_{11}$.
Subtracting $\gamma_i$ times the first column from the $i$th and then $\delta_j$ times the first row
from the $j$th ($i, j = 2, 3, \cdots, n$) all the coordinates in the first row and column
except $b'_{11}$ become 0, and we have an equivalent matrix in the form

\[ \begin{pmatrix} \alpha_1(\lambda) & 0 & 0 & \cdots & 0 \\ 0 & b_{22} & b_{23} & \cdots & b_{2n} \\ 0 & b_{32} & b_{33} & \cdots & b_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & b_{n2} & b_{n3} & \cdots & b_{nn} \end{pmatrix} \tag{2} \]

in which $\alpha_1$ is a factor of every $b_{ij}$. The coefficient of the highest power of $\lambda$
in $\alpha_1$ may be made 1 by a transformation of type III.

⁴ That is, the coordinate in the $p$th row and $i$th column.

The theorem now follows readily by induction. For, assuming it is true for
matrices of order $n - 1$, the matrix of this order formed by the $b$'s in (2) can
be reduced to the diagonal matrix

\[ \begin{pmatrix} \alpha_2(\lambda) & & & & & \\ & \alpha_3(\lambda) & & & & \\ & & \ddots & & & \\ & & & \alpha_s(\lambda) & & \\ & & & & 0 & \\ & & & & & \ddots \end{pmatrix} \]

where the $\alpha$'s satisfy the conditions of the theorem and each has $\alpha_1$ as a factor
(§3.01, Lemma 2). Moreover, the elementary transformations by which this
reduction is carried out correspond to transformations affecting the last $n - 1$
rows and columns alone in (2) and, because of the zeros in the first row and
column, these transformations when applied to (2) do not affect its first row
and column; also, since elementary transformations do not affect the rank
(§3.01, Lemma 1), $s$ equals $r$ and $a(\lambda)$ has therefore been reduced to the form
required by the theorem.

The theorem is clearly true for matrices of order 1 and hence is true for 
any order. 

Corollary. A matric polynomial whose determinant is independent of $\lambda$ and is
not 0, that is, an elementary polynomial, can be derived from 1 by the product
of a finite number of elementary transformations.

The polynomials $\alpha_i$ are called the invariant factors of $a(\lambda)$.
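The reduction can be followed on a small example, which is added here for illustration and is not part of the lectures. The order of the steps is one choice among many; each arrow is a single transformation of the type indicated.

```latex
\[
\begin{aligned}
a(\lambda) = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}
&\xrightarrow{\ \text{II: interchange the columns}\ }
\begin{pmatrix} 1 & \lambda \\ \lambda & 0 \end{pmatrix} \\
&\xrightarrow{\ \text{I: column 2} - \lambda\cdot\text{column 1}\ }
\begin{pmatrix} 1 & 0 \\ \lambda & -\lambda^2 \end{pmatrix} \\
&\xrightarrow{\ \text{I: row 2} - \lambda\cdot\text{row 1}\ }
\begin{pmatrix} 1 & 0 \\ 0 & -\lambda^2 \end{pmatrix} \\
&\xrightarrow{\ \text{III: row 2 multiplied by } -1\ }
\begin{pmatrix} 1 & 0 \\ 0 & \lambda^2 \end{pmatrix}.
\end{aligned}
\]
```

The invariant factors of this $a(\lambda)$ are therefore $\alpha_1 = 1$ and $\alpha_2 = \lambda^2$, and $\alpha_1$ is a factor of $\alpha_2$ as the theorem requires.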


3.03 Determinantal and invariant factors. The determinantal factor of the
$s$th order, $D_s$, of a matric polynomial $a(\lambda)$ is defined as the highest common
factor of all minors of order $s$, the coefficient of the highest power of $\lambda$ being
taken as 1. An elementary transformation of type I either leaves a given
minor unaltered or changes it into the sum of that minor and a multiple of
another of the same order, and a transformation of type II simply permutes
the minors of a given order among themselves, while one of type III merely
multiplies a minor by a constant different from 0. Hence equivalent matrices
have the same determinantal factors. Bearing this in mind we see immediately
from the form of (1) that the determinantal factors of $a(\lambda)$ are given by

\[ D_s = \alpha_1\alpha_2\cdots\alpha_s \quad (s = 1, 2, \cdots, r), \qquad D_s = 0 \quad (s > r), \]

so that⁵

\[ \alpha_s = D_s / D_{s-1}. \]

The invariant factors are therefore known when the determinantal factors are
given, and vice versa.
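The relation between determinantal and invariant factors lends itself to a direct, if inefficient, computation. The sketch below is an illustration under stated assumptions: polynomials are lists of rational coefficients with the constant term first (the empty list is 0), determinants are expanded by cofactors, and $D_0$ is taken as 1. All function names are hypothetical, and the method is only practical for small matrices.

```python
from fractions import Fraction
from itertools import combinations

def trim(p):
    """Drop trailing zero coefficients; [] represents the zero polynomial."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])

def pscale(c, p):
    return trim([c * v for v in p])

def pmul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def pdivmod(p, q):
    """Quotient and remainder of p divided by a non-zero polynomial q."""
    p, q = trim(p), trim(q)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    rem = [Fraction(v) for v in p]
    while len(rem) >= len(q):
        shift = len(rem) - len(q)
        c = rem[-1] / q[-1]
        quo[shift] = c
        rem = padd(rem, pscale(-c, [Fraction(0)] * shift + q))
    return trim(quo), rem

def pgcd(p, q):
    """Highest common factor with leading coefficient 1 (or [] if both are 0)."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, pdivmod(p, q)[1]
    return pscale(1 / Fraction(p[-1]), p) if p else []

def det(m):
    """Determinant of a square matrix of polynomials, by expansion along the first row."""
    if len(m) == 1:
        return trim(m[0][0])
    total = []
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total = padd(total, pscale((-1) ** j, pmul(entry, det(minor))))
    return total

def determinantal_factors(m):
    n = len(m)
    factors = []
    for s in range(1, n + 1):
        g = []
        for rows in combinations(range(n), s):
            for cols in combinations(range(n), s):
                g = pgcd(g, det([[m[r][c] for c in cols] for r in rows]))
        factors.append(g)
    return factors

def invariant_factors(m):
    out, prev = [], [Fraction(1)]          # prev plays the part of D_0 = 1
    for d in determinantal_factors(m):
        if not d:                          # D_s = 0 for s greater than the rank
            break
        out.append(pdivmod(d, prev)[0])    # alpha_s = D_s / D_(s-1)
        prev = d
    return out

LAM, ONE, ZERO = [0, 1], [1], []
m1 = [[LAM, ONE], [ZERO, LAM]]   # rows (lambda, 1), (0, lambda)
m2 = [[LAM, ZERO], [ZERO, LAM]]  # lambda times the identity
```

For `m1` the invariant factors come out as $1, \lambda^2$ and for `m2` as $\lambda, \lambda$, although both matrices have the determinant $\lambda^2$.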


⁵ Since $\alpha_{s-1}$ is a factor of $\alpha_s$, it follows that also $D_s^2$ is a factor of $D_{s-1}D_{s+1}$.

The definitions of this and the preceding sections have all been made relative
to the fundamental basis. But we have seen in §1.08 that, if $a_1$ is the matrix
with the same array of coordinates as $a$ but relative to another basis, then
there exists a non-singular constant matrix $b$ such that $a = b^{-1}a_1 b$, so that $a$
and $a_1$ are equivalent matrices. In terms of the new basis $a_1$ has the same
invariant factors as $a$ does in terms of the old and $a$, being equivalent to $a_1$,
has therefore the same invariant factors in terms of the new basis as it has in
the old. Hence the invariant and determinantal factors of a matric
polynomial are independent of the (constant) basis in terms of which its
coordinates are expressed.

The results of this section may be summarized as follows. 

Theorem 2. Two matric polynomials are equivalent if, and only if, they
have the same invariant factors.

3.04 Non-singular linear polynomials. In the case of linear polynomials 
Theorem 2 can be made more precise as follows. 

Theorem 3. If $a\lambda + b$ and $c\lambda + d$ are non-singular linear polynomials which
have the same invariant factors, and if $|c| \neq 0$, there exist non-singular constant
matrices $p$ and $q$ such that

\[ p(a\lambda + b)q = c\lambda + d. \]

We have seen in Theorem 2 that there exist elementary polynomials $P(\lambda)$,
$Q(\lambda)$ such that

\[ c\lambda + d = P(\lambda)(a\lambda + b)Q(\lambda). \tag{3} \]

Since $|c| \neq 0$, we can employ the division transformation to find matric
polynomials $p_1$, $q_1$ and constant matrices $p$, $q$ such that

\[ P(\lambda) = (c\lambda + d)p_1 + p, \qquad Q(\lambda) = q_1(c\lambda + d) + q. \]

Using this in (3) we have

\[ \begin{aligned} c\lambda + d = {} & p(a\lambda + b)q + (c\lambda + d)p_1(a\lambda + b)Q + P(a\lambda + b)q_1(c\lambda + d) \\ & - (c\lambda + d)p_1(a\lambda + b)q_1(c\lambda + d) \end{aligned} \tag{4} \]

and, since from (3)

\[ (a\lambda + b)Q = P^{-1}(c\lambda + d), \qquad P(a\lambda + b) = (c\lambda + d)Q^{-1}, \]

we may write in place of (4)

\[ \begin{aligned} p(a\lambda + b)q &= [1 - (c\lambda + d)(p_1P^{-1} + Q^{-1}q_1 - p_1(a\lambda + b)q_1)](c\lambda + d) \\ &= [1 - (c\lambda + d)R](c\lambda + d) \end{aligned} \tag{5} \]

where $R = p_1P^{-1} + Q^{-1}q_1 - p_1(a\lambda + b)q_1$, which is integral in $\lambda$ since $P$ and $Q$
are elementary. If $R \neq 0$, then, since $|c| \neq 0$, the degree of the right side
of (5) is at least 2, whereas the degree of the left side is only 1; hence $R = 0$,
so that (5) gives $p(a\lambda + b)q = c\lambda + d$. Since $c\lambda + d$ is not singular, neither
$p$ nor $q$ can be singular, and hence the theorem is proved.

When $|c| = 0$ (and therefore also $|a| = 0$) the remaining conditions of
Theorem 3 are not sufficient to ensure that we can find constant matrices in
place of $P$ and $Q$, but these conditions are readily modified so as to apply to
this case also. If we replace $\lambda$ by $\lambda/\mu$ and then multiply by $\mu$, $a\lambda + b$ is replaced
by the homogeneous polynomial $a\lambda + b\mu$; and the definition of invariant factors
applies immediately to such polynomials. In fact, if $|a| \neq 0$, the invariant
factors of $a\lambda + b\mu$ are simply the homogeneous polynomials which are
equivalent to the corresponding invariant factors of $a\lambda + b$. If, however, $|a| = 0$,
then $|a\lambda + b\mu|$ is divisible by a power of $\mu$, which leads to factors of the form
$\mu^i$ in the invariant factors of $a\lambda + b\mu$ which have no counterpart in those
of $a\lambda + b$.

If $|c| = 0$ but $|c\lambda + d\mu| \not\equiv 0$, there exist values $\lambda_1 \neq 0$, $\mu_1$ such that
$|c\lambda_1 + d\mu_1| \neq 0$ and, if we make the transformation

\[ \lambda = \lambda_1\alpha, \qquad \mu = \mu_1\alpha + \beta, \tag{6} \]

$a\lambda + b\mu$, $c\lambda + d\mu$ become $a_1\alpha + b_1\beta$, $c_1\alpha + d_1\beta$ where $a_1 = a\lambda_1 + b\mu_1$, $c_1 =
c\lambda_1 + d\mu_1$ (and $b_1 = b$, $d_1 = d$), and therefore $|c_1| \neq 0$. Further, when $a\lambda + b\mu$ and
$c\lambda + d\mu$ have the same invariant factors, this is also true of $a_1\alpha + b_1\beta$ and
$c_1\alpha + d_1\beta$. Since $|c_1| \neq 0$, the proof of Theorem 3 is applicable, so that there are
constant non-singular matrices $p$, $q$ for which $p(a_1\alpha + b_1\beta)q = c_1\alpha + d_1\beta$, and on
reversing the substitution (6) we have

\[ p(a\lambda + b\mu)q = c\lambda + d\mu. \]

Theorem 3 can therefore be extended as follows. 

Theorem 4. If the non-singular polynomials $a\lambda + b\mu$, $c\lambda + d\mu$ have the
same invariant factors, there exist non-singular constant matrices $p$, $q$ such that
$p(a\lambda + b\mu)q = c\lambda + d\mu$.

An important particular case of Theorem 3 arises when the polynomials
have the form $\lambda - b$, $\lambda - d$. For if $p(\lambda - b)q = \lambda - d$, on equating
coefficients we have $pq = 1$, $pbq = d$; hence $b = p^{-1}dp$, that is, $b$ and $d$ are similar.
Conversely, if $b$ and $d$ are similar, then $\lambda - b$ and $\lambda - d$ are equivalent, and
hence we have the following theorem.

Theorem 5. Two constant matrices $b$, $d$ are similar if, and only if, $\lambda - b$
and $\lambda - d$ have the same invariant factors.
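As an illustration of Theorem 5 (the two matrices are chosen here and are not from the lectures), take a matrix $b$ and the identity $d$, which have the same characteristic function $(\lambda - 1)^2$. The invariant factors are found from the determinantal factors of §3.03.

```latex
\[
b = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}:\qquad
\lambda - b = \begin{pmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 1 \end{pmatrix},\quad
D_1 = 1,\ D_2 = (\lambda - 1)^2,\quad
\alpha_1 = 1,\ \alpha_2 = (\lambda - 1)^2 ;
\]
\[
d = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}:\qquad
\lambda - d = \begin{pmatrix} \lambda - 1 & 0 \\ 0 & \lambda - 1 \end{pmatrix},\quad
D_1 = \lambda - 1,\ D_2 = (\lambda - 1)^2,\quad
\alpha_1 = \alpha_2 = \lambda - 1 .
\]
```

The invariant factors differ, so $b$ and $d$ are not similar, as is also clear from the fact that $p^{-1}dp = 1$ for every non-singular $p$.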

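3.05 Elementary divisors. If $D = |a\lambda + b|$ is not identically zero and
if $\lambda_1, \lambda_2, \cdots, \lambda_s$ are its distinct roots, say

\[ D = (\lambda - \lambda_1)^{\nu_1}(\lambda - \lambda_2)^{\nu_2}\cdots(\lambda - \lambda_s)^{\nu_s}, \]

then the invariant factors of $a\lambda + b$, being factors of $D$, have the form

\[ \begin{aligned} \alpha_1 &= (\lambda - \lambda_1)^{\nu_{11}}(\lambda - \lambda_2)^{\nu_{12}}\cdots(\lambda - \lambda_s)^{\nu_{1s}} \\ \alpha_2 &= (\lambda - \lambda_1)^{\nu_{21}}(\lambda - \lambda_2)^{\nu_{22}}\cdots(\lambda - \lambda_s)^{\nu_{2s}} \\ &\ \ \vdots \\ \alpha_n &= (\lambda - \lambda_1)^{\nu_{n1}}(\lambda - \lambda_2)^{\nu_{n2}}\cdots(\lambda - \lambda_s)^{\nu_{ns}} \end{aligned} \tag{7} \]

where $\sum_{j=1}^{n}\nu_{ji} = \nu_i$ and, since $\alpha_j$ is a factor of $\alpha_{j+1}$,

\[ \nu_{1i} \leq \nu_{2i} \leq \cdots \leq \nu_{ni} \qquad (i = 1, 2, \cdots, s). \tag{8} \]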

Such of the factors $(\lambda - \lambda_j)^{\nu_{ij}}$ as are not constants, that is, those for which
$\nu_{ij} > 0$, are called the elementary divisors of $a\lambda + b$. The elementary divisors
of $\lambda - b$ are also called the elementary divisors of $b$. When all the exponents
$\nu_{ij}$ which are not 0 equal 1, $b$ is said to have simple elementary divisors.

For some purposes the degrees of the elementary divisors are of more impor- 
tance than the divisors themselves and, when this is the case, they are indi- 
cated by writing 

\[ [(\nu_{n1}, \nu_{n-1,1}, \cdots, \nu_{11}),\ (\nu_{n2}, \nu_{n-1,2}, \cdots, \nu_{12}),\ \cdots] \tag{9} \]

where exponents belonging to the same linear factor are in the same
parenthesis, zero exponents being omitted; (9) is sometimes called the characteristic
of $a\lambda + b$. If a root, say $\lambda_1$, is zero, it is convenient to indicate this by writing
$\nu_{i1}^0$ in place of $\nu_{i1}$.

The maximum degree of $|a\lambda + b|$ is $n$ and therefore $\sum_{i,j}\nu_{ij} \leq n$, where the
equality sign holds only when $|a| \neq 0$.

The modifications necessary when the homogeneous polynomial $a\lambda + b\mu$
is taken in place of $a\lambda + b$ are obvious and are left to the reader.
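A short worked instance of these definitions, supplied here as an illustration with hypothetical invariant factors: suppose that $\lambda - b$, with $b$ of order 4, has the invariant factors shown. Splitting each into powers of distinct linear factors gives the elementary divisors, and collecting the exponents root by root gives the characteristic (9).

```latex
\[
\alpha_1 = 1,\qquad \alpha_2 = 1,\qquad \alpha_3 = \lambda - 1,\qquad
\alpha_4 = (\lambda - 1)^2(\lambda - 2),
\]
\[
\text{elementary divisors: } (\lambda - 1)^2,\ (\lambda - 1),\ (\lambda - 2);
\qquad
\text{characteristic: } [(2, 1),\ (1)].
\]
```

Here the exponents sum to $2 + 1 + 1 = 4 = n$, in agreement with the remark that equality holds when $|a| \neq 0$ (in this case $a = 1$).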

3.06 Matrices with given elementary divisors. The direct investigation of
the form of a matrix with given elementary divisors is somewhat tedious.
It can be carried out in a variety of ways; but, since the form once found is
easily verified, we shall here state this form and give the verification, merely
saying in passing that it is suggested by the results of §2.07 together with a
study of a matrix whose reduced characteristic function is $(\lambda - \lambda_1)^{\nu}$.


Theorem 6. If $\lambda_1, \lambda_2, \cdots, \lambda_s$ are any constants, not necessarily all different,
and $\nu_1, \nu_2, \cdots, \nu_s$ are positive integers whose sum is $n$, and if $a_i$ is the array of $\nu_i$
rows and columns given by

\[ \begin{pmatrix} \lambda_i & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix} \tag{10} \]

where each coordinate on the main diagonal equals $\lambda_i$, those on the parallel on its
right are 1, and the remaining ones are 0, and if $a$ is the matrix of $n$ rows and columns
given by

\[ \begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_s \end{pmatrix} \tag{11} \]

composed of blocks of terms defined by (10) arranged so that the main diagonal
of each lies on the main diagonal of $a$, the other coordinates being 0, then $\lambda - a$
has the elementary divisors

\[ (\lambda - \lambda_1)^{\nu_1},\ (\lambda - \lambda_2)^{\nu_2},\ \cdots,\ (\lambda - \lambda_s)^{\nu_s}. \tag{12} \]

In addition to using $a_i$ to denote the block given in (10) we shall also use it
for the matrix having this block in the position indicated in (11) and zeros
elsewhere. In the same way, if $f_i$ is a block with $\nu_i$ rows and columns with
1's in the main diagonal and zeros elsewhere, we may also use $f_i$ for the
corresponding matrix of order $n$. We can then write

\[ \lambda - a = \sum_i(\lambda f_i - a_i), \qquad f_i a = a_i = a f_i, \qquad \sum_i f_i = 1. \]

The block of terms corresponding to $\lambda f_i - a_i$ has then the form

\[ \begin{pmatrix} \lambda - \lambda_i & -1 & & \\ & \lambda - \lambda_i & -1 & \\ & & \ddots & \ddots \\ & & & \lambda - \lambda_i \end{pmatrix} \quad (\nu_i \text{ rows and columns}) \tag{13} \]

where only the non-zero terms are indicated. The determinant of these $\nu_i$
rows and columns is $(\lambda - \lambda_i)^{\nu_i}$ and this determinant has a first minor equal
to $\pm 1$; the invariant factors of $\lambda f_i - a_i$, regarded as a matrix of order $\nu_i$,
are therefore $1, 1, \cdots, 1, (\lambda - \lambda_i)^{\nu_i}$ and hence it can be reduced by elementary
transformation to the diagonal form

\[ \begin{pmatrix} (\lambda - \lambda_i)^{\nu_i} & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}. \]

If we apply the same elementary transformations to the corresponding rows
and columns of $\lambda - a$, the effect is the same as regards the block of terms
$\lambda f_i - a_i$ (corresponding to $a_i$ in (11)) since all the other coordinates in the rows
and columns which contain elements of this block are 0; moreover these
transformations do not affect the remaining blocks $\lambda f_j - a_j$ ($j \neq i$) nor any 0
coordinate. Carrying out this process for $i = 1, 2, \cdots, s$ and permuting rows and
columns, if necessary, we arrive at the form

\[ \begin{pmatrix} (\lambda - \lambda_1)^{\nu_1} & & & & & & \\ & (\lambda - \lambda_2)^{\nu_2} & & & & & \\ & & \ddots & & & & \\ & & & (\lambda - \lambda_s)^{\nu_s} & & & \\ & & & & 1 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix}. \]

Suppose now that the notation is so arranged that

\[ \lambda_1 = \lambda_2 = \cdots = \lambda_p = \alpha, \qquad \nu_1 \geq \nu_2 \geq \cdots \geq \nu_p \]

but $\lambda_i \neq \alpha$ for $i > p$. The $n$th determinantal factor $D_n$ then contains $(\lambda - \alpha)$
to the power $\sum_{i=1}^{p}\nu_i$ exactly. Each minor of order $n - 1$ contains at least $p - 1$
of the factors

\[ (\lambda - \alpha)^{\nu_1},\ (\lambda - \alpha)^{\nu_2},\ \cdots,\ (\lambda - \alpha)^{\nu_p} \tag{14} \]

and in one the highest power $(\lambda - \alpha)^{\nu_1}$ is lacking; hence $D_{n-1}$ contains $(\lambda - \alpha)$
to exactly the power $\sum_{i=2}^{p}\nu_i$ and hence the $n$th invariant factor $\alpha_n$ contains it
to exactly the $\nu_1$th power. Similarly the minors of order $n - 2$ each contain
at least $p - 2$ of the factors (14) and one lacks the two factors of highest degree;
hence $(\lambda - \alpha)$ is contained in $D_{n-2}$ to exactly the power $\sum_{i=3}^{p}\nu_i$ and in $\alpha_{n-1}$
to the power $\nu_2$. Continuing in this way we see that (14) gives the elementary
divisors of $a$ which are powers of $(\lambda - \alpha)$ and, treating the other roots in the
same way, we see that the complete list of elementary divisors is given by (12)
as required by the theorem.
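The matrix $a$ of (11) is easy to assemble from a list of pairs $(\lambda_i, \nu_i)$. The following sketch is illustrative only; the function name `canonical_form` and the sample list of blocks are assumptions made for the example.

```python
def canonical_form(blocks):
    """Build the matrix (11) from a list of (lambda_i, nu_i) pairs.

    Each pair contributes one block of the shape (10): lambda_i on the main
    diagonal, 1 on the parallel to its right, 0 elsewhere.
    """
    n = sum(nu for _, nu in blocks)
    a = [[0] * n for _ in range(n)]
    start = 0
    for lam, nu in blocks:
        for k in range(nu):
            a[start + k][start + k] = lam
            if k + 1 < nu:
                a[start + k][start + k + 1] = 1
        start += nu
    return a

# Hypothetical data: elementary divisors (l - 2)^3, (l - 2), (l - 5)^2, so n = 6.
a = canonical_form([(2, 3), (2, 1), (5, 2)])
```

By Theorem 6, $\lambda - a$ for this `a` has exactly the elementary divisors used to build it; note that the last row of each block carries no 1, so the blocks do not run into one another.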


3.07 If $A$ is a matrix with the same elementary divisors as $a$, it follows from
Theorem 5 that there is a matrix $P$ such that $A = PaP^{-1}$ and hence, if we choose
in place of the fundamental basis $(e_1, e_2, \cdots, e_n)$ the basis $(Pe_1, Pe_2, \cdots, Pe_n)$,
it follows from Theorem 6 of chapter 1 that (11) gives the form of $A$ relative
to the new basis. This form is called the canonical form of $A$. It follows
immediately from this that

\[ P^{-1}A^kP = \begin{pmatrix} a_1^k & & & \\ & a_2^k & & \\ & & \ddots & \\ & & & a_s^k \end{pmatrix} \tag{15} \]

where $a_i^k$ is the block of terms derived by forming the $k$th power of $a_i$ regarded
as a matrix of order $\nu_i$.

Since $D_n$ equals $|\lambda - a|$, it is the characteristic function of $a$ (or $A$) and,
since $D_{n-1}$ is the highest common factor of the first minors, it follows from
Theorem 3 of chapter 2 that $\alpha_n$ is the reduced characteristic function.

If we add the $f$'s together in groups, each group consisting of all the $f$'s that
correspond to the same value of $\lambda_i$, we get a set of idempotent matrices, say
$\varphi_1, \varphi_2, \cdots, \varphi_r$, corresponding to the distinct roots of $a$, say $\alpha_1, \alpha_2, \cdots, \alpha_r$.
These are the principal idempotent elements of $a$; for (i) $a\varphi_i = \varphi_i a$, (ii) $(a - \alpha_i)\varphi_i$
is nilpotent, (iii) $\sum\varphi_i = \sum f_i = 1$ and $\varphi_i\varphi_j = 0$ ($i \neq j$), so that the conditions of
§2.11 are satisfied.

When the same root $\alpha_i$ occurs in several elementary divisors, the corresponding
$f$'s are called partial idempotent elements of $a$; they are not unique as is seen
immediately by taking $a = 1$.

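If $\alpha$ is one of the roots of $A$, the form of $A - \alpha$ is sometimes important.
Suppose that $\lambda_1 = \lambda_2 = \cdots = \lambda_p = \alpha$, $\lambda_i \neq \alpha$ ($i > p$) and set

\[ b_i = a_i - \alpha f_i, \]

the corresponding array in the $i$th block of $a - \alpha$ (cf. (10), (11)) being

\[ \begin{pmatrix} \lambda_i - \alpha & 1 & & \\ & \lambda_i - \alpha & 1 & \\ & & \ddots & \ddots \\ & & & \lambda_i - \alpha \end{pmatrix}. \tag{16} \]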

In the case of the first $p$ blocks $\lambda_i - \alpha = 0$ and the corresponding $b_1, b_2, \cdots, b_p$
are nilpotent, the index of $b_i$ being $\nu_i$ and, assuming $\nu_1 \geq \nu_2 \geq \cdots \geq \nu_p$ as
before, $(A - \alpha)^k$ has the form

\[ P^{-1}(A - \alpha)^kP = \begin{pmatrix} b_1^k & & & \\ & b_2^k & & \\ & & \ddots & \\ & & & b_s^k \end{pmatrix} \]

or, when $k \geq \nu_1$,

\[ P^{-1}(A - \alpha)^kP = \begin{pmatrix} 0 & & & & \\ & \ddots & & & \\ & & b_{p+1}^k & & \\ & & & \ddots & \\ & & & & b_s^k \end{pmatrix}. \tag{17} \]

Since none of the diagonal coordinates of $b_{p+1}, \cdots, b_s$ are 0, the rank
of $(A - \alpha)^k$, when $k \geq \nu_1$, is exactly $n - \sum_{i=1}^{p}\nu_i = \sum_{i=p+1}^{s}\nu_i$ and the nullspace
of $(A - \alpha)^k$ is then the same as that of $(A - \alpha)^{\nu_1}$. Hence, if there exists a
vector $z$ such that $(A - \alpha)^k z = 0$ but $(A - \alpha)^{k-1}z \neq 0$, then (i) $k \leq \nu_1$,
(ii) $z$ lies in the nullspace of $(A - \alpha)^{\nu_1}$.
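The behaviour of the rank of $(A - \alpha)^k$ can be observed on a matrix already in canonical form. The sketch below is an illustration with a hypothetical matrix whose elementary divisors are $(\lambda - 2)^3$, $(\lambda - 2)$, $(\lambda - 5)^2$, so that $\alpha = 2$, $p = 2$, $\nu_1 = 3$, $\nu_2 = 1$ and $n = 6$. The rank is found by elimination in exact rational arithmetic.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    rows, cols, rk = len(m), len(m[0]), 0
    for c in range(cols):
        pivot = next((r for r in range(rk, rows) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, rows):
            f = m[r][c] / m[rk][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

# Canonical matrix with blocks (2, 3), (2, 1), (5, 2) in the sense of (10), (11).
a = [[2, 1, 0, 0, 0, 0],
     [0, 2, 1, 0, 0, 0],
     [0, 0, 2, 0, 0, 0],
     [0, 0, 0, 2, 0, 0],
     [0, 0, 0, 0, 5, 1],
     [0, 0, 0, 0, 0, 5]]
alpha = 2
n = len(a)
shifted = [[a[i][j] - (alpha if i == j else 0) for j in range(n)] for i in range(n)]

ranks = []          # ranks of (a - alpha)^k for k = 1, 2, 3, 4
power = shifted
for k in range(1, 5):
    ranks.append(rank(power))
    power = matmul(power, shifted)
```

The ranks fall from 4 to 3 to 2 and then stay at $2 = n - (\nu_1 + \nu_2)$ once $k \geq \nu_1 = 3$, so the nullspace stops growing at that point.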

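3.08 Invariant vectors. If $A$ is a matrix with the elementary divisors
given in the statement of Theorem 6, then $\lambda - A$ is equivalent to $\lambda - a$ and
by Theorem 5 there is a non-singular matrix $P$ such that $A = PaP^{-1}$. If we
denote the unit vectors corresponding to the rows and columns of $a_i$ in (10)
by $e_1^i, e_2^i, \cdots, e_{\nu_i}^i$ and set

\[ x_j^i = \begin{cases} Pe_j^i & (j = 1, 2, \cdots, \nu_i;\ i = 1, 2, \cdots, s) \\ 0 & (j < 1 \text{ or } > \nu_i \text{ or } i < 1 \text{ or } > s) \end{cases} \tag{18} \]

then

\[ ae_1^i = \lambda_i e_1^i, \quad ae_2^i = \lambda_i e_2^i + e_1^i, \quad \cdots, \quad ae_{\nu_i}^i = \lambda_i e_{\nu_i}^i + e_{\nu_i - 1}^i \]

and hence

\[ Ax_j^i = \lambda_i x_j^i + x_{j-1}^i \qquad (j = 1, 2, \cdots, \nu_i;\ i = 1, 2, \cdots, s). \tag{19} \]

The vectors $x_j^i$ are called a set of invariant vectors⁶ of $A$.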
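Relation (19) can be verified on a matrix of order 2. In the sketch below, which is only an illustration, $P$ and its inverse are written out by hand and $a$ is a single block (10) with $\lambda_1 = 3$, $\nu_1 = 2$; all names and numbers are assumptions of the example.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

P = [[1, 0], [1, 1]]
P_inv = [[1, 0], [-1, 1]]      # inverse of P, written out by hand
a = [[3, 1], [0, 3]]           # one block (10) with root 3 and exponent 2

A = matmul(matmul(P, a), P_inv)   # A = P a P^{-1}

# Invariant vectors x_j = P e_j, i.e. the columns of P.
x1 = matvec(P, [1, 0])
x2 = matvec(P, [0, 1])

lhs1, rhs1 = matvec(A, x1), [3 * v for v in x1]                    # A x1 = 3 x1
lhs2, rhs2 = matvec(A, x2), [3 * v + u for v, u in zip(x2, x1)]    # A x2 = 3 x2 + x1
```

Here `A` comes out as the matrix with rows (2, 1) and (-1, 4), which does not look like a block (10) at all, yet its invariant vectors satisfy the chain relations exactly.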

The matrix $A$ can be expressed in terms of its invariant vectors as follows.
We have from (10)

\[ a = \sum_{i,j}(\lambda_i e_j^i + e_{j-1}^i)Se_j^i = \sum_{i,j} e_j^i S(\lambda_i e_j^i + e_{j+1}^i) \]

and hence, if

\[ y_j^i = (P')^{-1}e_j^i = (PP')^{-1}x_j^i, \tag{20} \]

then

\[ A = \sum_{i,j}(\lambda_i x_j^i + x_{j-1}^i)Sy_j^i = \sum_{i,j} x_j^i S(\lambda_i y_j^i + y_{j+1}^i), \tag{21} \]

where it should be noted that the $y$'s form a system reciprocal to the $x$'s and
that each of these systems forms a basis of the vector space since $|P| \neq 0$.
If we form the transverse of $A$, we have from (21)

\[ A' = \sum_{i,j}(\lambda_i y_j^i + y_{j+1}^i)Sx_j^i, \tag{22} \]

so that the invariant vectors of $A'$ are obtained by forming the system
reciprocal to the $x$'s and inverting the order in each group of vectors corresponding
to a given elementary divisor; thus

\[ A'y_{\nu_i}^i = \lambda_i y_{\nu_i}^i, \quad A'y_{\nu_i - 1}^i = \lambda_i y_{\nu_i - 1}^i + y_{\nu_i}^i, \quad \cdots, \quad A'y_1^i = \lambda_i y_1^i + y_2^i. \]

⁶ If homogeneous coordinates are used so that vectors represent points, an invariant
vector is usually called a pole.


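A matrix $A$ and its transverse clearly have the same elementary divisors and
are therefore similar. The matrix which transforms $A$ into $A'$ can be given
explicitly as follows. Let $q_i$ be the symmetric array

\[ \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix} \quad (\nu_i \text{ rows and columns}). \]

It is easily seen that $q_i a_i = a_i' q_i$ and hence, if $Q$ is the matrix

\[ Q = \begin{pmatrix} q_1 & & & \\ & q_2 & & \\ & & \ddots & \\ & & & q_s \end{pmatrix} \]

we have $Qa = a'Q$, and a short calculation gives $A' = R^{-1}AR$ where $R$ is the
symmetric matrix

\[ R = PQ^{-1}P' = PQP'. \tag{23} \]

If the elementary divisors of $A$ are simple, then $Q = 1$ and $R = PP'$.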
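The "short calculation" can be followed on a matrix of order 2. The matrices below are chosen for illustration only and are not from the lectures; $a$ is a single block (10) with root 3, so that $Q = q_1$.

```latex
\[
P = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},\quad
a = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix},\quad
A = PaP^{-1} = \begin{pmatrix} 2 & 1 \\ -1 & 4 \end{pmatrix},\quad
Q = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]
\[
Qa = \begin{pmatrix} 0 & 3 \\ 3 & 1 \end{pmatrix} = a'Q,\qquad
R = PQP' = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix},\qquad
AR = \begin{pmatrix} 1 & 4 \\ 4 & 7 \end{pmatrix} = RA' .
\]
```

Since $|R| = -1 \neq 0$, the last equation is the same as $A' = R^{-1}AR$, and $R$ is symmetric as stated.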

If the roots $\lambda_i$ of the elementary divisors (12) are all different, the nullity
of $(A - \lambda_i)$ is 1, and hence $x_1^i$ is unique to a scalar multiplier. But the
remaining $x_j^i$ are not unique. In fact, if the $x$'s denote one choice of the invariant
vectors, we may take in place of $x_j^i$

\[ z_j^i = k_1^i x_j^i + k_2^i x_{j-1}^i + \cdots + k_j^i x_1^i \qquad (j = 1, 2, \cdots, \nu_i), \]

where the $k$'s are any constant scalars subject to the condition $k_1^i \neq 0$.
Suppose now that $\lambda_1 = \lambda_2 = \cdots = \lambda_p = \alpha$, $\lambda_i \neq \alpha$ ($i > p$) and $\nu_1 \geq \nu_2 \geq \cdots \geq \nu_p$
as in §3.07. We shall say that $z_1, z_2, \cdots, z_k$ is a chain⁷ of invariant vectors
belonging to the exponent $k$ if

\[ z_i = (A - \alpha)^{k-i}z_k \neq 0 \quad (i = 1, 2, \cdots, k), \qquad (A - \alpha)^k z_k = 0. \tag{24} \]

It is also convenient to set $z_i = 0$ for $i < 0$ or $> k$. We have already seen
that $k \leq \nu_1$ and that $z_k$ lies in the nullspace of $(A - \alpha)^{\nu_1}$; and from (17) it is
seen that the nullspace of $(A - \alpha)^{\nu_1}$ has the basis $(x_j^i;\ j = 1, 2, \cdots, \nu_i,\ i =
1, 2, \cdots, p)$.

⁷ We shall say that the chain is generated by $z_k$.

Since $z_k$ belongs to the nullspace of $(A - \alpha)^{\nu_1}$, we may set

\[ z_k = \sum_{i=1}^{p}\sum_{j=1}^{\nu_i}\zeta_{ij}x_j^i \tag{25} \]

and therefore by repeated application of (19) with $\lambda_i = \alpha$

\[ (A - \alpha)^r z_k = \sum_{i,j}\zeta_{ij}x_{j-r}^i. \tag{26} \]

From this it follows that, in order that $(A - \alpha)^k z_k = 0$, only values of $j$ which
are less than or equal to $k$ can actually occur in (25) and in order that
$(A - \alpha)^{k-1}z_k \neq 0$ at least one $\zeta_{ik}$ must be different from 0; hence

\[ \begin{aligned} z_k &= \sum_i(\zeta_{ik}x_k^i + \zeta_{i,k-1}x_{k-1}^i + \cdots) \\ z_{k-1} &= \sum_i(\zeta_{ik}x_{k-1}^i + \zeta_{i,k-1}x_{k-2}^i + \cdots) \\ &\ \ \vdots \\ z_1 &= \sum_i\zeta_{ik}x_1^i. \end{aligned} \tag{27} \]

Finally, if we impose the restriction that $z_k$ does not belong to any chain
pertaining to an exponent greater than $k$, it is necessary and sufficient that $k$ be
one of the numbers $\nu_1, \nu_2, \cdots, \nu_p$, and that no value of $i$ corresponding to an
exponent greater than $k$ occur in (27).

3.09 The actual determination of the vectors $x_j^i$ can be carried out by the
processes of §3.02 and §3.04 or alternatively as follows. Suppose that the
first $s_1$ of the exponents $\nu_i$ equal $n_1$, the next $s_2$ equal $n_2$, and so on, and finally
the last $s_q$ equal $n_q$. Let $\mathfrak{N}_1$ be the nullspace of $(A - \alpha)^{n_1}$ and $\mathfrak{N}_1'$ the nullspace
of $(A - \alpha)^{n_1 - 1}$; then $\mathfrak{N}_1$ contains $\mathfrak{N}_1'$. If $\mathfrak{M}_1$ is a space complementary to
$\mathfrak{N}_1'$ in $\mathfrak{N}_1$, then for any vector $x$ in $\mathfrak{M}_1$ we have $(A - \alpha)^r x = 0$ only when
$r \geq n_1$. Also, if $x_1, x_2, \cdots, x_{m_1}$ is a basis of $\mathfrak{M}_1$, the vectors

\[ (A - \alpha)^r x_i \qquad (r = 0, 1, \cdots, n_1 - 1) \tag{28} \]

are linearly independent; for, if

\[ \sum_{r=s}^{n_1 - 1}\sum_i \xi_{ir}(A - \alpha)^r x_i = 0, \]

some $\xi_{is}$ being different from 0, then multiplying by $(A - \alpha)^{n_1 - s - 1}$ we have

\[ (A - \alpha)^{n_1 - 1}\sum_i \xi_{is}x_i = 0, \]

which is only possible if every $\xi_{is} = 0$ since $x_1, x_2, \cdots, x_{m_1}$ form a basis of $\mathfrak{M}_1$
and for no vector $x$ of $\mathfrak{M}_1$ other than 0 is $(A - \alpha)^{n_1 - 1}x = 0$. The space defined by (28)
clearly lies in $\mathfrak{N}_1$; we shall denote it by $\mathfrak{P}_1$. If we set $\mathfrak{N}_1 = \mathfrak{N}_2 + \mathfrak{P}_1$ where $\mathfrak{N}_2$
is complementary to $\mathfrak{P}_1$ in $\mathfrak{N}_1$, then $\mathfrak{N}_2$ contains all vectors which are members
of sets belonging to the exponents $n_2, n_3, \cdots$ but not lying in sets with the
exponent $n_1$.

We now set $\mathfrak{N}_2 = \mathfrak{N}_2' + \mathfrak{M}_2$ where $\mathfrak{N}_2'$ is the subspace of vectors $x$ in $\mathfrak{N}_2$ such
that $(A - \alpha)^{n_2 - 1}x = 0$. As before the elements of $\mathfrak{M}_2$ generate sets with
exponent $n_2$ but are not members of sets with higher exponents; and by a
repetition of this process we can determine step by step the sets of invariant vectors
corresponding to each exponent $n_i$.



CHAPTER IV 


VECTOR POLYNOMIALS. SINGULAR MATRIC POLYNOMIALS

4.01 Vector polynomials. If a matric polynomial in $\lambda$ is singular, the elements
of its nullspace may depend on $\lambda$. We are therefore led to consider vectors
whose coordinates are polynomials in a scalar variable $\lambda$; such a vector is called
a vector polynomial. Any vector polynomial can be put in the form

\[ z(\lambda) = z_0\lambda^m + z_1\lambda^{m-1} + \cdots + z_m \]

where $z_0, z_1, \cdots, z_m$ are vectors whose coordinates are independent of $\lambda$ and,
if $z_0 \neq 0$, $m$ is called the degree of $z(\lambda)$. In a linear set with a basis composed
of vector polynomials we are usually only concerned with those vectors that
have integral coordinates when expressed in terms of the basis and, when this
is so, we shall call the set an integral set. In a basis of an integral set the
degree of an element of maximum degree will be called the degree of the basis.

In practice an integral set is often given in terms of a sequence of vectors 
which are not linearly independent and so do not form a basis. For the present 
therefore we shall say that the sequence of vector polynomials 

\[ z_1(\lambda),\ z_2(\lambda),\ \cdots,\ z_k(\lambda) \tag{1} \]

defines the integral set of all vectors of the form $\sum\zeta_i(\lambda)z_i(\lambda)$ where the $\zeta$'s are scalar
polynomials, and show later that this is really an integral set by finding for it
an integral basis. The sequence (1) is said to have rank $r$ if $|z_{i_1}z_{i_2}\cdots z_{i_s}|$
vanishes identically in $\lambda$ for all choices of the $i$'s when $s > r$ and is not identically
0 for some choice of the $i$'s when $s = r$.

The theory of integral sets can be expressed entirely in terms of matric
polynomials, but it will make matters somewhat clearer not to do so at first.
By analogy with matrices we define an elementary transformation of a sequence
of vector polynomials as follows. An elementary transformation of the sequence
(1) is the operation of replacing it by a sequence $z_1', z_2', \cdots, z_k'$ where:

Type I: $z_i' = z_i + \sum_{p \neq i}\zeta_p z_p$, $\quad z_q' = z_q$ ($q \neq i$),

Type II: $z_i' = z_j$, $\quad z_j' = z_i$, $\quad z_q' = z_q$ ($q \neq i, j$),

Type III: $z_p' = \rho_p z_p$ ($p = 1, \cdots, k$),

where the $\zeta$'s are scalar polynomials and the $\rho$'s constants none of which is 0.

The rank of a sequence is not altered by an elementary transformation, and 
two sequences connected by an elementary transformation are equivalent in 
the sense that every vector polynomial belonging to the integral set defined 
by the one also belongs to the integral set defined by the other. 

Two sequences which can be derived the one from the other by elementary
transformations are said to be equivalent. The corresponding integral sets
may also be said to be equivalent; and if only transformations with constant
coefficients are used, the equivalence is said to be strict. Equivalence may also
be defined as follows. If $P$ is an elementary matrix which turns any vector of
the integral linear set $(z_1, z_2, \cdots, z_k)$ into a vector of the same set, then it is
easily shown that this set is equivalent to $(Pz_1, Pz_2, \cdots, Pz_k)$, and
we also say that the linear set $(z_1, z_2, \cdots, z_k)$ is invariant under $P$ although the
individual elements of the basis are not necessarily unchanged. If the
restriction that $P$ leaves $(z_1, z_2, \cdots, z_k)$ invariant is not imposed, the two sets are
said to be similar.

4.02 The degree invariants. We have seen in the previous section that
the sequence in terms of which an integral set is defined may be transformed
by elementary transformations without altering the integral set itself. We
shall now show how we may choose a normalized basis and determine certain
invariants connected with the set. Let the vectors (1), when written in full, be

\[ z_i(\lambda) = \lambda^{m_i}z_{i0} + \lambda^{m_i - 1}z_{i1} + \cdots + z_{im_i} \tag{2} \]

and suppose the notation so arranged that $m_1 \leq m_2 \leq \cdots \leq m_k$. Suppose
further that the leading coefficients $z_{10}, z_{20}, \cdots, z_{s-1,0}$ are linearly
independent but that

\[ z_{s0} = \sum_{i=1}^{s-1}\eta_i z_{i0}, \]

the $\eta$'s being constants not all 0; then $m_s \geq m_i$ ($i = 1, 2, \cdots, s - 1$) and

\[ z_s' = z_s - \sum_{i=1}^{s-1}\eta_i\lambda^{m_s - m_i}z_i \]

is either 0 or has a lower degree than $z_s$, and it may replace $z_s$ in the sequence.

After a finite number of elementary transformations of this kind we arrive
at a sequence equivalent to (1) which consists of a number $p$ of vector
polynomials $x_1, x_2, \cdots, x_p$ in which the leading coefficients are linearly independent
followed by $k - p$ zero-vectors. Now if we form $|x_1x_2\cdots x_p|$ using the
notation of (2) with $x$'s in place of $z$'s, the term of highest degree is
$\lambda^{m_1 + \cdots + m_p}|x_{10}x_{20}\cdots x_{p0}|$, which is not 0 since the leading vectors $x_{10}, x_{20}, \cdots, x_{p0}$ are
linearly independent. But the rank of a sequence is not changed by
elementary transformations; hence $p = r$ and we have the following theorem.

Theorem 1. If $z_1, z_2, \cdots, z_k$ is a sequence of vector polynomials of rank $r$,
the set of vectors of the form $\sum_{i=1}^{k} \zeta_i z_i$, the $\zeta$'s being scalar polynomials, form
an integral set with a basis of order $r$ which may be so chosen that the leading
coefficients of its constituent vectors are linearly independent.

When a basis of an integral set satisfies the conditions of this theorem and
its elements are arranged in order of ascending degree, we shall call it a normal
basis.
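
The reduction used in the proof of Theorem 1 may be illustrated by a small worked case. The vectors below are chosen only for illustration and do not occur in the text; the space is taken of order 2 with fundamental units $e_1$, $e_2$.

```latex
% Illustrative data (an assumption, not from the text):
%   z_1 = \lambda e_1 + e_2,  z_2 = \lambda^2 e_1 + e_1,  so m_1 = 1, m_2 = 2.
\begin{align*}
z_{10} &= e_1, \qquad z_{20} = e_1 = \eta_1 z_{10} \quad (\eta_1 = 1),\\
z_2'   &= z_2 - \eta_1 \lambda^{m_2 - m_1} z_1
        = (\lambda^2 e_1 + e_1) - \lambda(\lambda e_1 + e_2)
        = -\lambda e_2 + e_1 .
\end{align*}
% The leading coefficients of z_1 and z_2' are e_1 and -e_2, which are
% linearly independent, so (z_1, z_2') satisfies the conditions of Theorem 1
% with both degrees equal to 1.  The determinant
%   | z_1 z_2' | = -\lambda^2 - 1
% has as its term of highest degree  \lambda^{1+1} | e_1 \; {-e_2} | = -\lambda^2 .
```

One step suffices here because the replacement lowers the degree of $z_2$ from 2 to 1 and at the same time makes the leading coefficients independent.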

Corollary. If $x_1, x_2, \cdots, x_r$ is a normal basis with the degrees $m_1 \le m_2 \le \cdots \le m_r$,
and if $\xi_1, \xi_2, \cdots, \xi_r$ are scalar polynomials, then the degree of the
vector polynomial $x = \sum_{i=1}^{s} \xi_i x_i$ $(\xi_s \ne 0)$ is not less than $m_s$.


Theorem 2. If $x_1, x_2, \cdots, x_r$ is a normal basis of an integral set and
$m_1 \le m_2 \le \cdots \le m_r$ the corresponding degrees, and if $y_1, y_2, \cdots, y_r$ is any
other basis with the degrees $n_1 \le n_2 \le \cdots \le n_r$, then

$m_1 \le n_1, \; m_2 \le n_2, \; \cdots, \; m_r \le n_r$.

Further, the exponents $m_1, m_2, \cdots, m_r$ are the same for all normal bases.

Let $s$ be the first integer for which $n_s < m_s$ so that $n_i \le n_s < m_s$ for $i \le s$.
Since $(x_1, x_2, \cdots, x_r)$ is a basis, we may set

$y_i = \sum_j \eta_{ij} x_j$ $(i = 1, 2, \cdots, s)$.

Here no value of $j$ greater than $s - 1$ is admissible since the degree $n_i$ of $y_i$
is less than $m_s$. This would mean that the rank of $y_1, y_2, \cdots, y_s$ was less
than $s$, which is impossible since they form part of a basis. Hence $m_s \le n_s$
for all values of $s$.

If both bases are normal, it follows immediately that $m_i \le n_i$ and also
$n_i \le m_i$, whence $m_i = n_i$, that is, the set of exponents $m_1, m_2, \cdots, m_r$ is the
same for all normal bases. We shall call these exponents the degree invariants
of the integral set.


4.03 Elementary sets. If $z_1(\lambda), z_2(\lambda), \cdots, z_r(\lambda)$ is a basis of an integral
set, but not necessarily a normal basis, the $r$-vector $| z_1 z_2 \cdots z_r |$, which we
call the determinant of the basis, is not identically 0 but may vanish for certain
values of $\lambda$. If it vanishes for $\lambda = \lambda_1$, then $z_1(\lambda_1), z_2(\lambda_1), \cdots, z_r(\lambda_1)$ are
linearly dependent, that is, there is a relation $\sum \zeta_i z_i(\lambda_1) = 0$; we may assume
$\zeta_1 \ne 0$ without loss of generality. It follows that $\sum \zeta_i z_i(\lambda)$ has a factor of the
form $(\lambda - \lambda_1)^{\alpha}$, $\alpha \ge 1$, and hence

$z_1'(\lambda) = \dfrac{\sum \zeta_i z_i(\lambda)}{(\lambda - \lambda_1)^{\alpha}}$

is integral; and, since $\zeta_1 \ne 0$, every element of $(z_1(\lambda), z_2(\lambda), \cdots, z_r(\lambda))$ is
integrally expressible in terms of $(z_1'(\lambda), z_2(\lambda), \cdots, z_r(\lambda))$. Moreover, since

$| z_1' z_2 \cdots z_r | = \dfrac{\zeta_1}{(\lambda - \lambda_1)^{\alpha}} | z_1 z_2 \cdots z_r |$,

the determinant of the new basis is of lower degree than that of the old and so,
if we continue this process, we shall arrive after a finite number of steps at a
basis $(x_1(\lambda), x_2(\lambda), \cdots, x_r(\lambda))$ whose elements are linearly independent for all
values of $\lambda$. A set which has a basis of this kind will be called an elementary
integral set; and it is readily shown that every basis of an elementary integral
set has the given property, namely, that its elements are linearly independent
for every value of $\lambda$. These results are summarized as follows.

Theorem 3. Every integral set of order $r$ is contained in an elementary set
of the same order.
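
As an illustration of the process leading to Theorem 3, consider a small case in a space of order 2. The basis below is an assumption made for the example and is not taken from the text.

```latex
% Illustrative basis (an assumption): z_1 = e_1, z_2 = e_1 + \lambda e_2.
\begin{align*}
| z_1 z_2 | &= \lambda, \quad\text{which vanishes for } \lambda_1 = 0,\\
-z_1(0) + z_2(0) &= 0 \qquad (\zeta_1 = -1,\ \zeta_2 = 1),\\
z_1'(\lambda) &= \frac{-z_1 + z_2}{\lambda} = e_2 ,\\
| z_1' z_2 | &= \frac{\zeta_1}{\lambda}\, | z_1 z_2 | = -1 .
\end{align*}
% The new basis (e_2, e_1 + \lambda e_2) has a determinant which vanishes for
% no value of \lambda, so it defines an elementary set; the original set is
% contained in it since z_1 = z_2 - \lambda z_1'.
```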

We also have 

Theorem 4. Let $x_1, x_2, \cdots, x_r$ be a basis of an elementary set. If $r < n$, there
exists a complementary elementary basis $x_{r+1}, \cdots, x_n$ such that $| x_1 x_2 \cdots x_n | \ne 0$
for any value of $\lambda$ and this basis can be so chosen that its degree does not exceed
that of $x_1, x_2, \cdots, x_r$.

Theorem 5. If $x_1, x_2, \cdots, x_r$ is a basis of an elementary set, there exists an
elementary matric polynomial $X$ such that $x_i = Xe_i$ $(i = 1, 2, \cdots, r)$.

For let $y$ be a constant vector which for some value$^1$ of $\lambda$ is not linearly
dependent on $x_1, x_2, \cdots, x_r$ so that we do not have identically $y = \sum \eta_i x_i$ for any
$\eta$'s which are scalar polynomials. If for some value of $\lambda$, say $\lambda_1$, we have
$y = \sum \zeta_i x_i(\lambda_1)$, the $\zeta$'s being constants, then $y - \sum \zeta_i x_i(\lambda)$ has the factor $\lambda - \lambda_1$ and,
as in the proof of Theorem 3, we can modify $y$ step by step till we arrive at a
vector polynomial $x_{r+1}$ such that $x_1, x_2, \cdots, x_r, x_{r+1}$ form an elementary
basis. The degree at each step of this process does not exceed that of the
original basis since only constant multipliers are used. This procedure may
be continued till a basis of order $n$ is reached, which proves Theorem 4.

The proof of Theorem 5 is immediate; in fact, using the basis derived in the
proof of Theorem 4, $X = \sum_{i=1}^{n} x_i S e_i$ satisfies the required conditions and
$| X | = | x_1 x_2 \cdots x_n |$, which does not vanish for any value of $\lambda$.

As a converse to Theorem 5 we have that, if $X$ is an elementary matrix, then
$x_i = Xe_i$ $(i = 1, 2, \cdots, r)$ is a basis of the elementary set $(x_1, x_2, \cdots, x_r)$.

4.04 If $z_1, z_2, \cdots, z_k$ is a sequence of vector polynomials of rank $r$, we may
always assume $k \le n$ by merely increasing the order of the fundamental space,
if necessary. Setting $z_i = \sum_j \zeta_{ji} e_j$, let us consider the matric polynomial

$Z = \begin{Vmatrix} \zeta_{11} & \cdots & \zeta_{1k} & 0 & \cdots & 0 \\ \zeta_{21} & \cdots & \zeta_{2k} & 0 & \cdots & 0 \\ \cdot & & \cdot & \cdot & & \cdot \\ \zeta_{n1} & \cdots & \zeta_{nk} & 0 & \cdots & 0 \end{Vmatrix} = \sum_{i=1}^{k} z_i S e_i.$

$^1$ If the question of degree is not important, any vector polynomial satisfying this

The elementary transformations used in §4.02 in finding a basis of the integral
set correspond when applied to $Z$ to a combination of elementary transformations,
as defined in §3.01, and because these transformations involve columns
only, they correspond to multiplying $Z$ on the right by an elementary polynomial
$Q_1$. Similarly, if

$Z = \sum_i e_i S y_i, \quad y_i = \sum_j \zeta_{ij} e_j,$

the process of finding a basis for $y_1, y_2, \cdots, y_n$, whose rank is $r$, corresponds
to multiplying $Z$ on the left by an elementary polynomial $P_1$.

We shall now suppose that $k = r$ so that $Q_1 = 1$; then $P_1 Z$ has the form

$Z_1 = P_1 Z = \begin{Vmatrix} \omega_{11} & \cdots & \omega_{1r} & 0 & \cdots & 0 \\ \omega_{21} & \cdots & \omega_{2r} & 0 & \cdots & 0 \\ \cdot & & \cdot & \cdot & & \cdot \\ \omega_{r1} & \cdots & \omega_{rr} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & 0 & \cdots & 0 \\ \cdot & & \cdot & \cdot & & \cdot \\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{Vmatrix}.$

We now bring $Z_1$ to the normal form of §3.02, say

$PZ_1Q = PP_1ZQ = \begin{Vmatrix} \xi_1 & & & & & \\ & \xi_2 & & & & \\ & & \ddots & & & \\ & & & \xi_r & & \\ & & & & 0 & \\ & & & & & \ddots \end{Vmatrix}$

where $\xi_1, \xi_2, \cdots, \xi_r$ are the invariant factors of $Z$ (or $Z_1$) and in doing so only
the first $r$ rows and columns are involved so that

Therefore, if $x_i = (PP_1)^{-1} e_i$, we have successively

$PP_1ZQe_i = \xi_i e_i, \quad ZQe_i = \xi_i (PP_1)^{-1} e_i = \xi_i x_i$ $(i = 1, 2, \cdots, r)$

and, if $Q = \| q_{ij} \|$,

$Qe_i = q_{1i} e_1 + q_{2i} e_2 + \cdots + q_{ri} e_r$ $(i = 1, 2, \cdots, r)$

and hence

(4) $\xi_i x_i = ZQe_i = q_{1i} z_1 + q_{2i} z_2 + \cdots + q_{ri} z_r$ $(i = 1, 2, \cdots, r)$.

But from (3) and the fact that $| Q |$ is a constant different from 0, it follows that
the determinant of the coefficients in (4) is also a constant different from 0,
and hence these equations can be solved for the $z$'s in terms of the $x$'s giving, say

$z_i = \sum_j b_{ji} \xi_j x_j$ $(i = 1, 2, \cdots, r)$

where the $b$'s are scalar polynomials.

Returning now to the case $k > r$, we see that, since we can pass to the case
$k = r$ by elementary transformations, the $\xi$'s are still the invariant factors of
$Z = \sum_{i=1}^{k} z_i S e_i$. They are therefore also invariants of the integral set
independently of the basis chosen to represent it, and so we shall call them the invariant
factors of the set.

We can now state the following theorem.

Theorem 6. If $\xi_1, \xi_2, \cdots, \xi_r$ are the invariant factors of an integral set of
vector polynomials, we can find a basis of the form

$\xi_1 x_1, \; \xi_2 x_2, \; \cdots, \; \xi_r x_r$

where $x_1, x_2, \cdots, x_r$ define an elementary set.

4.05 Linear elementary bases. We shall derive in this section a canonical
form for a basis of an elementary linear set. If

(5) $z_1, z_2, \cdots, z_r, \quad z_i = x_i \lambda + y_i$

is a basis of an elementary linear set, it is convenient, though not necessary, to
associate with it the matrix

(6) $A\lambda - B = \sum_{i=1}^{r} g_i S z_i = \sum_{i=1}^{r} g_i S (x_i \lambda + y_i)$

where $g_1, g_2, \cdots, g_r$ is a sequence of linearly independent constant vectors.
When this is done, it should be noted that multiplying $A\lambda - B$ on the right
by an elementary matrix $P$ corresponds to replacing (5) by the similar sequence
$P'z_1, \cdots, P'z_r$. Multiplying on the left by $P$ has no immediate interpretation
in terms of the sequence except when

$P g_i = \sum_{j=1}^{r} p_{ji} g_j$ $(i = 1, 2, \cdots, r)$

in which case we can write

$P(A\lambda - B) = \sum_{i=1}^{r} g_i S \sum_{j=1}^{r} p_{ij} z_j = \sum_{i=1}^{r} g_i S z_i'$

and the set $(z_1', \cdots, z_r')$ is equivalent to (5); when $P$ is constant, the equivalence
is strict.

Instead of restricting ourselves to the matrix (6), we shall only assume to
begin with that $A\lambda - B$ is a linear matric polynomial of rank $r < n$. The
nullspace $\mathfrak{N}$ of $A\lambda - B$ is then an elementary integral set, a normalized basis
of which we shall take to be

(7) $a_i = a_{i0} \lambda^{m_i} + a_{i1} \lambda^{m_i - 1} + \cdots + a_{i m_i}$ $(i = 1, 2, \cdots, n - r)$.

From $(A\lambda - B) a_i = 0$ we have

$A a_{i0} = 0, \quad A a_{i1} = B a_{i0}, \quad \cdots, \quad 0 = B a_{i m_i}$,

or, if we set $a_{ij} = 0$ for $j < 0$ or $j > m_i$,

(8) $A a_{ij} = B a_{i,j-1}$ $(j = 0, 1, 2, \cdots, m_i + 1)$.
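
The relations (8) can be checked numerically on a small example. The sketch below uses an assumed toy pencil of rank 1 in a space of order 2 (the matrices `A`, `B` and the null-vector coefficients `a` are illustrative choices, not data from the text) and verifies both the chain relations and the vanishing of $(A\lambda - B)a(\lambda)$ at sample values of $\lambda$.

```python
def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

# Assumed toy pencil, not taken from the text:
#   A*lam - B = [[lam, 1], [0, 0]], of rank r = 1 in a space of order n = 2.
A = [[1, 0], [0, 0]]
B = [[0, -1], [0, 0]]

# Null-vector a(lam) = a_0*lam + a_1 = (1, -lam), of degree m = 1.
a = [[0, -1], [1, 0]]  # a[0] is the leading coefficient a_0
m = len(a) - 1

def coeff(j):
    """Return a_j, using the convention a_j = 0 for j < 0 or j > m."""
    return a[j] if 0 <= j <= m else [0, 0]

# Relations (8): A a_j = B a_{j-1} for j = 0, 1, ..., m + 1.
chain_ok = all(matvec(A, coeff(j)) == matvec(B, coeff(j - 1))
               for j in range(m + 2))

def pencil_times_null(lam):
    """Evaluate (A*lam - B) a(lam) at a numeric value of lam."""
    vec = [coeff(0)[k] * lam + coeff(1)[k] for k in range(2)]
    pencil = [[A[i][k] * lam - B[i][k] for k in range(2)] for i in range(2)]
    return matvec(pencil, vec)

# The coefficients a_0, a_1 are linearly independent: their determinant is 1.
coeff_det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
```

In this example the single null-vector has degree 1 and its two coefficient vectors are independent, in agreement with the independence statement proved next for the general case.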


We shall now show that the vectors $a_{ij}$ $(i = 1, 2, \cdots, n - r$; $j = 0, 1, \cdots, m_i)$
are linearly independent. Assume that the $a_{ij}$ are linearly independent
for $(i = 1, 2, \cdots, p - 1$; $j = 0, 1, \cdots, m_i)$ and $(i = p$; $j = 0, 1, \cdots, q - 1)$
but that

(9) $a_{pq} + \sum_{j=0}^{q-1} \alpha_{pj} a_{pj} + \sum_{i=1}^{p-1} \sum_{j=0}^{m_i} \alpha_{ij} a_{ij} = 0$.

Let $s_1$ be the greatest value of $j$ for which some $\alpha_{ij} \ne 0$ and let $s$ be the greater
of $s_1$ and $q$. If we set

(10) $c_t = a_{p,q-s+t} + \sum_{j=0}^{q-1} \alpha_{pj} a_{p,j-s+t} + \sum_{i=1}^{p-1} \sum_{j=0}^{m_i} \alpha_{ij} a_{i,j-s+t}$,

then $c_s = 0$ and

$c_{s-1} = a_{p,q-1} + \sum_{j=1}^{q-1} \alpha_{pj} a_{p,j-1} + \sum_{i=1}^{p-1} \sum_{j=1}^{m_i} \alpha_{ij} a_{i,j-1}$

which is not 0 by hypothesis, except perhaps when $q = 0$ and every $\alpha_{ij}$ $(j \ne 0)$
in (9) is 0, which, however, is not possible since by Theorem 1 the leading
coefficients $a_{i0}$ in (7) are linearly independent. Also from (8) it follows that
$A c_t = B c_{t-1}$, and hence

(11) $c(\lambda) = c_0 \lambda^{s-1} + c_1 \lambda^{s-2} + \cdots + c_{s-1}$

is a null-vector of $A\lambda - B$ of degree less than $m_p$. But every such integral
null-vector is linearly dependent on $a_1, a_2, \cdots, a_{p-1}$ with integral coefficients, say

(12) $c(\lambda) = \sum_{i=1}^{p-1} \gamma_i(\lambda) a_i(\lambda)$;

and this gives

$c_{s-1} = c(0) = \sum_{i=1}^{p-1} \gamma_i(0) a_i(0) = \sum_{i=1}^{p-1} \gamma_i(0) a_{i m_i}$,

which is impossible since $c_{s-1}$ is obtained from (9) by lowering the second
subscript in each term and no such subscript greater than $m_i$ can occur in any
$a_{ij}$. Hence the $a_{ij}$ are linearly independent.

In order to simplify the notation we shall now set

(13) $a_{ij} = Q e_j^i$ $(i = 1, 2, \cdots, n - r$; $j = 0, 1, \cdots, m_i)$

where $Q$ is a constant non-singular matrix and the $e_j^i$ are fundamental units
rearranged by setting, say, $e_k = e_j^i$ when $k = \sum_{q=1}^{i-1} (m_q + 1) + j + 1$; as before
$e_j^i = 0$ for $j < 0$ and $j > m_i$. We shall denote the space defined by the $e_j^i$
by $\mathfrak{M}_1$ and the complementary space by $\mathfrak{M}_2$; since the bases of $\mathfrak{M}_1$ and $\mathfrak{M}_2$
can be chosen as sequences of fundamental units, they are reciprocal as well as
complementary.

We return to the particular case in which $A\lambda - B$ is given by (6).
Corresponding to (13) we define a new set of vectors $w_i$ by

(14) $w_i = Q' z_i$ $(i = 1, 2, \cdots, r)$

and when this is done a normal basis of the nullspace $\mathfrak{N}$ of $(A\lambda - B)Q = \sum g_i S w_i$
is given by

(15) $b_i = e_0^i \lambda^{m_i} + e_1^i \lambda^{m_i - 1} + \cdots + e_{m_i}^i$ $(i = 1, 2, \cdots, n - r)$.

We have seen in §1.10 that $b_1, b_2, \cdots, b_{n-r}$ is the space reciprocal to
$w_1, w_2, \cdots, w_r$. Now in $\mathfrak{M}_1$ the $\sum_{i=1}^{n-r} m_i$ vectors

(16) $f_j^i = e_{j-1}^i - \lambda e_j^i$ $(i = 1, 2, \cdots, n - r$; $j = 1, 2, \cdots, m_i)$

are linearly independent; and they form the set reciprocal to (15) in $\mathfrak{M}_1$ since
$S f_j^i b_p = 0$ for all $i, j, p$ and the sum of the orders of the two sets is $\sum m_i + (n - r)$
which is the order of $\mathfrak{M}_1$. Hence the total set $(w_1, w_2, \cdots, w_r)$ reciprocal
to $(b_1, b_2, \cdots, b_{n-r})$ is composed of (16) together with $\mathfrak{M}_2$. We shall
call this form of basis a canonical basis of the set (14). We can now state the
following theorems.

Theorem 7. A linear elementary set of order $r$ has a basis of the form

(17) $g_j$ $(j = 1, 2, \cdots, \mu)$, $\quad g_j^i = a_{i,j-1} + \lambda a_{ij}$ $(i = 1, 2, \cdots, \nu$; $j = 1, 2, \cdots, m_i)$

where the constant vectors $g_j$, $a_{ij}$ are linearly independent for all $j$ and $i$ and the
integers $m_i$ are those degree invariants of the reciprocal set that are not 0.

We shall call each set $g_j^i$ $(j = 1, 2, \cdots, m_i)$ a chain of index $m_i$, and define
the integers $m_1, m_2, \cdots, m_\nu$ as the Kronecker invariants of the set. A basis
of the form (17) will be called canonical.

Theorem 8. Two linear elementary sets are similar if, and only if, they have
the same Kronecker invariants and the same order.

It should be noted that, if $r$ is the order of the set, then

(18) $\mu + \sum_{i=1}^{\nu} m_i = r, \quad \mu + \sum_{i=1}^{\nu} (m_i + 1) \le n, \quad \nu \le n - r$.

If $r = n$, all the Kronecker invariants are 0 and there are no chains in the basis.

If $z_1, z_2, \cdots, z_r$ is a normal basis of an elementary linear set, the first $\mu$ being
constant and the rest linear in $\lambda$, and $g_j$, $g_j^i$ is a canonical basis, the notation
being that of Theorem 7, then clearly the set $g_j$ $(j = 1, 2, \cdots, \mu)$ is strictly
equivalent to $(z_1, z_2, \cdots, z_\mu)$ and the remaining vectors have the form

(19) $g_j^i = w_j^i + u_j^i + \lambda v_j^i$

where $u_j^i$ and $v_j^i$ belong to $(z_1, z_2, \cdots, z_\mu)$ and the $w_j^i$ are constant linear
combinations of $z_{\mu+1}, \cdots, z_r$. Since a canonical basis is also normal,

(20) $g_j$ $(j = 1, 2, \cdots, \mu)$, $\quad w_j^i$ $(i = 1, 2, \cdots, \nu$; $j = 1, 2, \cdots, m_i)$

is a normal basis strictly equivalent to $(z_1, z_2, \cdots, z_r)$. Now (19) may be
written

(21) $w_j^i = g_j^i - u_j^i - \lambda v_j^i = a_{i,j-1} - u_j^i + \lambda (a_{ij} - v_j^i) = z_j^i + b_j^i$

where $b_j^i = v_{j-1}^i - u_j^i$ is a constant vector of the linear set $(g_1, g_2, \cdots, g_\mu)$ and

(22) $z_j^i = a_{i,j-1} - v_{j-1}^i + \lambda (a_{ij} - v_j^i)$.

Here (22) together with the $g_j$ form a canonical basis which from (21) is strictly
equivalent to (20) and therefore to $(z_1, z_2, \cdots, z_r)$. We therefore have the
following theorem.

Theorem 9. Every normal basis of a linear elementary set is strictly equivalent
to some canonical basis.

4.06 Singular linear polynomials. Let $A\lambda + B$ be a matric polynomial
of rank $r < n$. Its left and right grounds are linear integral sets of rank $r$,
and by Theorems 3 and 7 we can find canonical bases in terms of which the
vectors of the two grounds can be integrally expressed, say

(23) $z_1, z_2, \cdots, z_r$; $\quad w_1, w_2, \cdots, w_r$

respectively, where the first $\alpha$ $z$'s and $\beta$ $w$'s are constant and the rest linear in $\lambda$.
When $A\lambda + B$ is expressed in terms of these bases, then, remembering that no
second degree term can appear, we see that it has the form

(24) $\sum_{i=1}^{\alpha} \sum_{j=1}^{\beta} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=1}^{r-\alpha} \sum_{j=1}^{\beta} h_{\alpha+i,j} z_{\alpha+i} S w_j + \sum_{i=1}^{\alpha} \sum_{j=1}^{r-\beta} h_{i,\beta+j} z_i S w_{\beta+j}$.

The row vectors

(25') $p_i = \sum_{j=1}^{\beta} (h_{ij} \lambda + k_{ij}) w_j + \sum_{j=1}^{r-\beta} h_{i,\beta+j} w_{\beta+j}$ $(i = 1, 2, \cdots, \alpha)$

(25'') $p_{\alpha+i} = \sum_{j=1}^{\beta} h_{\alpha+i,j} w_j$ $(i = 1, 2, \cdots, r - \alpha)$

form a set of $r$ linearly independent vectors and, since the set (25'') depends
only on $\beta$ $w$'s, we must have $r - \alpha \le \beta$. Setting $\gamma = \alpha + \beta - r$ we may
replace $w_{\gamma+1}, \cdots, w_\beta$ by $p_{\alpha+1}, \cdots, p_r$ in (23) without destroying the canonical
form of the basis. A similar change can be made independently in the
$z$ basis by replacing $z_{\gamma+j}$ by $\sum_{i=1}^{\alpha} h_{i,\beta+j} z_i$ $(j = 1, 2, \cdots, r - \beta = \alpha - \gamma)$.

When we assume that these changes have been made to begin with, we may
take in place of (24)

(26) $A\lambda + B = \sum_{i=1}^{\alpha} \sum_{j=1}^{\beta} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=1}^{r-\alpha} z_{\alpha+i} S w_{\gamma+i} + \sum_{j=1}^{r-\beta} z_{\gamma+j} S w_{\beta+j}$.
Figure 1 shows schematically the effect of this change of basis. To begin with
the coefficients in (24) may be arranged in a square array $AR$ of side $r$; the
first double sum corresponds to the rectangle $AL$, the second to $JQ$ and the
third to $CM$, and the rectangle $LR$ contains only zeros. After the transformation
which leads to (26), the only change in the scheme is that in $JQ$ the part
$JP$ is now zero and the square $KQ$ has 1 in the main diagonal and zeros elsewhere,
and $CM$ also takes a similar form.

[Fig. 1. Schematic square array of side $r$, subdivided into rectangles whose corner points are labeled $A$ to $R$.]

If we set

$z_{\alpha+i}' = z_{\alpha+i} + \sum_{j=1}^{\alpha} (h_{j,\gamma+i} \lambda + k_{j,\gamma+i}) z_j$ $(i = 1, 2, \cdots, r - \alpha)$

$w_{\beta+i}' = w_{\beta+i} + \sum_{j=1}^{\gamma} (h_{\gamma+i,j} \lambda + k_{\gamma+i,j}) w_j$ $(i = 1, 2, \cdots, r - \beta)$

then $z_1, \cdots, z_\alpha, z_{\alpha+1}', \cdots, z_r'$ and $w_1, \cdots, w_\beta, w_{\beta+1}', \cdots, w_r'$ are still elementary
bases of the right and left grounds, and in terms of them (26) becomes

$\sum_{i,j=1}^{\gamma} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=1}^{r-\alpha} z_{\alpha+i}' S w_{\gamma+i} + \sum_{i=1}^{r-\beta} z_{\gamma+i} S w_{\beta+i}'$.

The number of terms in these summations after summing for $j$ is
$\gamma + (r - \alpha) + (r - \beta) = r$. Hence the rank of the square array $h_{ij} \lambda + k_{ij}$ $(i, j = 1, 2, \cdots, \gamma)$
is $\gamma$ and by a change of variable of the form $\lambda - \lambda_1 = \lambda'$, if necessary,
we can secure that the array $k_{ij}$ is also of rank $\gamma$.

The transformation just employed disturbs the canonical form of the basis
and we have now to devise a different transformation which will avoid this.
Let us set in place of $w_1, w_2, \cdots, w_\gamma$

$w_j' = w_j - \sum_{t=\gamma+1}^{\beta} p_{j,t-\gamma} w_t$ $(j = 1, 2, \cdots, \gamma)$

where the $p$'s are constants to be determined later, and for brevity set also

$h_j = \sum_{i=1}^{\gamma} h_{ij} z_i, \quad k_j = \sum_{i=1}^{\gamma} k_{ij} z_i$ $(j = 1, 2, \cdots, \beta)$;

since the rank of $k_{ij}$ $(i, j = 1, 2, \cdots, \gamma)$ is $\gamma$, the vectors $k_1, k_2, \cdots, k_\gamma$ form a
basis of $(z_1, z_2, \cdots, z_\gamma)$. After this change of basis the part of the first double
sum (cf. (26)) which corresponds to $i = 1, 2, \cdots, \gamma$; $j = \gamma + 1, \cdots, \beta$ is

(27) $\sum_{j=\gamma+1}^{\beta} \Big[ h_j \lambda + k_j + \sum_{t=1}^{\gamma} p_{t,j-\gamma} (h_t \lambda + k_t) \Big] S w_j$.

Consider now a single chain of $z$'s of index $s$ which by a suitable change of
notation we may suppose to be $z_{\alpha+1}, z_{\alpha+2}, \cdots, z_{\alpha+s}$; we shall seek to determine
the $p$'s so that the corresponding part of (27) shall become

(28) $\sum_{j=\gamma+1}^{\gamma+s} (g_{j-\gamma-1} + \lambda g_{j-\gamma}) S w_j$,

the $g$'s being vectors in the space $(z_1, z_2, \cdots, z_\gamma)$. Equating corresponding
terms in (27) and (28) we have

$g_0 = k_{\gamma+1} + \sum_t p_{t1} k_t$,
$g_1 = h_{\gamma+1} + \sum_t p_{t1} h_t = k_{\gamma+2} + \sum_t p_{t2} k_t$,
$\cdots$
$g_{s-1} = h_{\gamma+s-1} + \sum_t p_{t,s-1} h_t = k_{\gamma+s} + \sum_t p_{ts} k_t$,
$g_s = h_{\gamma+s} + \sum_t p_{ts} h_t$.

Choosing $p_{t1}$ $(t = 1, 2, \cdots, \gamma)$ arbitrarily we define $g_0$ by the first equation;
then the second defines $p_{t2}$ since the vector $h_{\gamma+1} + \sum_t p_{t1} h_t - k_{\gamma+2}$ can be
expressed uniquely in terms of the basis $(k_1, k_2, \cdots, k_\gamma)$; and the remaining
$p$'s are similarly determined in succession, while the last equation defines $g_s$.
If we now in our basis put in place of $z_{\alpha+i}$

$z_{\alpha+i}' = z_{\alpha+i} + g_{i-1} + \lambda g_i$ $(i = 1, 2, \cdots, s)$

and combine the corresponding part of (27) with $\sum_{i=1}^{s} z_{\alpha+i} S w_{\gamma+i}$, the two
together give $\sum_{i=1}^{s} z_{\alpha+i}' S w_{\gamma+i}$ and the new basis is still canonical. We then
treat all the $z$ chains in the same way and have finally in place of (26)

$\sum_{i,j=1}^{\gamma} (h_{ij} \lambda + k_{ij}) z_i S w_j' + \sum_{i=\gamma+1}^{\alpha} \sum_{j=1}^{\beta} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=1}^{r-\alpha} z_{\alpha+i}' S w_{\gamma+i} + \sum_{j=1}^{r-\beta} z_{\gamma+j} S w_{\beta+j}$.

The changes in the bases used above have replaced the coordinates $h_{ij} \lambda + k_{ij}$
by 0 for the range $i = 1, 2, \cdots, \gamma$; $j = \gamma + 1, \cdots, \beta$ and have left them
wholly unaltered for $i = \gamma + 1, \cdots, \alpha$; $j = 1, 2, \cdots, \gamma$. We can therefore
interchange the roles of the $z$'s and $w$'s and by modifying the $z$ basis in the same way
can make these coordinates zero for the second range of subscripts without
altering the zeros already obtained for the first range. Hence it is possible
so to choose the canonical bases that

(29) $A\lambda + B = \sum_{i,j=1}^{\gamma} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=\gamma+1}^{\alpha} \sum_{j=\gamma+1}^{\beta} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i=1}^{r-\alpha} z_{\alpha+i} S w_{\gamma+i} + \sum_{j=1}^{r-\beta} z_{\gamma+j} S w_{\beta+j}$.

Here the first summation can be reduced to a canonical form without affecting
the rest of the expression; we therefore neglect it in the meantime and deal
only with the remaining terms. This is equivalent to putting $\gamma = 0$, and in
making this change it is convenient to alter the notation so as to indicate the
chains in the bases.

As in §4.05 let the chains of the $z$ and $w$ bases be

$f_{j-1}^i + \lambda f_j^i$ $(j = 1, 2, \cdots, s_i$; $i = 1, 2, \cdots, \nu_1)$
$g_{q-1}^p + \lambda g_q^p$ $(q = 1, 2, \cdots, t_p$; $p = 1, 2, \cdots, \nu_2)$

respectively, and denote the constant vectors of the respective bases by $z_q^p$
and $w_j^i$ where $i, j, p, q$ take the values indicated above since, when $\gamma = 0$, we
have $\sum s_i = r - \alpha = \beta$, $\sum t_p = r - \beta = \alpha$. We have then to determine
a canonical form for the matrix

(30) $\sum (h_{ij}^{pq} \lambda + k_{ij}^{pq}) z_q^p S w_j^i + \sum (f_{j-1}^i + \lambda f_j^i) S w_j^i + \sum z_q^p S (g_{q-1}^p + \lambda g_q^p)$

and in doing so we shall show that the first summation can be eliminated by a
proper choice of the bases of the chains.

It will simplify the notation if we consider first only two chains, one of index
$s$ in the $z$-basis and the other of index $t$ in the $w$-basis and, omitting the
superscripts, choose the notation so that these chains are $f_0 + \lambda f_1, \cdots, f_{s-1} + \lambda f_s$
and $g_0 + \lambda g_1, \cdots, g_{t-1} + \lambda g_t$. We now modify these by adding $a_{j-1} + \lambda a_j$
to $f_{j-1} + \lambda f_j$ and $b_{i-1} + \lambda b_i$ to $g_{i-1} + \lambda g_i$ choosing

$a_j = \sum_{i=1}^{t} \alpha_{ij} z_i$ $(j = 0, 1, \cdots, s)$, $\quad b_i = \sum_{j=1}^{s} \beta_{ij} w_j$ $(i = 0, 1, \cdots, t)$

in such a way as to eliminate the corresponding terms in the first summation
of (30). To do this we must choose the $\alpha$'s and $\beta$'s so that

(31) $h_{ij} = \alpha_{ij} + \beta_{ij}, \quad k_{ij} = \alpha_{i,j-1} + \beta_{i-1,j}$ $(i = 1, 2, \cdots, t$; $j = 1, 2, \cdots, s)$.

For $j > 1$ this gives $\alpha_{i,j-1} = h_{i,j-1} - \beta_{i,j-1}$ and hence if $l_{ij} = k_{ij} - h_{i,j-1}$
we may write

(32') $k_{i1} = \alpha_{i0} + \beta_{i-1,1}$ $(i = 1, 2, \cdots, t)$

(32'') $l_{ij} = \beta_{i-1,j} - \beta_{i,j-1}$ $(i = 1, 2, \cdots, t$; $j = 2, \cdots, s)$.

If we give $\alpha_{i0}$ $(i = 1, 2, \cdots, t)$ arbitrary values, (32') defines $\beta_{i1}$ for $i = 0,
1, \cdots, t - 1$ and leaves $\beta_{t1}$ arbitrary; then $j = 2$ in (32'') gives $\beta_{i2}$ for $i =
0, 1, \cdots, t - 1$ and leaves $\beta_{t2}$ arbitrary, and so on: and when the $\beta$'s are found
in this way, certain of them being arbitrary, the first equation of (31) gives the
remaining $\alpha$'s.

Combining every $z$ chain in this way with each $w$ chain in turn, we may
eliminate all the terms in the quadruple sum in (30), and (29) may therefore,
by a proper choice of the two bases, be replaced by

(33) $A\lambda + B = \sum_{i,j=1}^{\gamma} (h_{ij} \lambda + k_{ij}) z_i S w_j + \sum_{i,j} (f_{j-1}^i + \lambda f_j^i) S w_j^i + \sum_{p,q} z_q^p S (g_{q-1}^p + \lambda g_q^p)$

where no two of the linear sets

(34') $(z_1, z_2, \cdots, z_\gamma)$, $\quad (f_0^i, f_1^i, \cdots, f_{s_i}^i$; $i = 1, 2, \cdots, \nu_1)$, $\quad (z_1^p, z_2^p, \cdots, z_{t_p}^p$; $p = 1, 2, \cdots, \nu_2)$

have any vector in common, and also no two of

(34'') $(w_1, w_2, \cdots, w_\gamma)$, $\quad (w_1^i, w_2^i, \cdots, w_{s_i}^i$; $i = 1, 2, \cdots, \nu_1)$, $\quad (g_0^p, g_1^p, \cdots, g_{t_p}^p$; $p = 1, 2, \cdots, \nu_2)$

have any vector in common.

We shall now for the moment suppose that the order $n$ of the fundamental
space is taken so large that we can introduce vectors $z_0^p$ $(p = 1, 2, \cdots, \nu_2)$ into
the third set in (34') without causing the three spaces to overlap, and also
$w_0^i$ $(i = 1, 2, \cdots, \nu_1)$ into the second set of (34''). As a matter of convenience
we can then find two constant non-singular matrices $P$, $Q$ such that

$P z_i = e_i = Q' w_i, \quad P f_j^i = e_j^i = Q' w_j^i, \quad P z_q^p = \epsilon_q^p = Q' g_q^p$

where the range of the affixes is as in (34) and where

$e_j^i = e_k$, $\quad k = \gamma + \sum_{a=1}^{i-1} (s_a + 1) + j + 1$,
$\epsilon_q^p = e_k$, $\quad k = \gamma + \sum_{i=1}^{\nu_1} (s_i + 1) + \sum_{a=1}^{p-1} (t_a + 1) + q + 1$

and, when this is done,

(35) $P(A\lambda + B)Q = \sum_{i,j=1}^{\gamma} (h_{ij} \lambda + k_{ij}) e_i S e_j + \sum_{i,j} (e_{j-1}^i + \lambda e_j^i) S e_j^i + \sum_{p,q} \epsilon_q^p S (\epsilon_{q-1}^p + \lambda \epsilon_q^p)$.

This matrix is composed of a number of blocks of terms arranged along the
main diagonal, the remaining coordinates being 0. It must be carefully observed
however that, owing to the introduction of the vectors $z_0^p$, $w_0^i$, and to the fact
that a chain of index $s$ depends on $s + 1$ constant vectors, the total number
of rows and columns employed is greater than the rank $r$ by the total number
of chains in the left and right grounds.

The first summation in (35) gives a block $\| h_{ij} \lambda + k_{ij} \|$ of $\gamma$ rows and columns.

Each chain in the second and third summation gives a block of the respective
forms

(36) $\begin{Vmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 0 & \lambda & \cdots & 0 & 0 \\ \cdot & \cdot & \cdot & & \cdot & \cdot \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{Vmatrix}, \qquad \begin{Vmatrix} 0 & 0 & 0 & \cdots & 0 & 0 \\ 1 & \lambda & 0 & \cdots & 0 & 0 \\ 0 & 1 & \lambda & \cdots & 0 & 0 \\ \cdot & \cdot & \cdot & & \cdot & \cdot \\ 0 & 0 & 0 & \cdots & \lambda & 0 \\ 0 & 0 & 0 & \cdots & 1 & \lambda \end{Vmatrix}.$

If we now form the invariant factors of $A\lambda + B$,
these forms show that we obtain the invariant factors of $\sum (h_{ij} \lambda + k_{ij}) e_i S e_j$
together with a number of 1's from the blocks of type (36), the number
contributed by each being one less than the number of rows it contains, that is,
the index of the corresponding chain. This gives the following theorem.


Theorem 10. Two matric polynomials $A\lambda + B$ and $C\lambda + D$ are strictly
equivalent if, and only if, they have the same invariant factors and their respective
right and left grounds have the same Kronecker invariants.


That these conditions are necessary is obvious; that they are sufficient follows
readily from the form (33) derived above. In the first place, since the Kronecker
invariants are the same for both, the second and third summations in
(33) have the same form for both and are therefore strictly equivalent by
means of transformations which do not change the terms in the first summation.
Secondly, the first summation in both yields the same invariant factors since
the number of 1's due to the remaining terms depends only on the number of
chains, which is the same for both; hence these summations are strictly
equivalent and, because of the linear independence of the constant vectors involved,
the equivalence is obtainable by transformations which do not affect the
remaining terms.
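
The statement that a block of the first type in (36) contributes a number of 1's equal to the index of its chain can be examined numerically. The following is a minimal sketch, with helper names (`chain_block`, `rank`, `left_null`) that are our own and not the author's; it builds the block for a chain of index `s` at a numeric value of $\lambda$, checks that its rank is `s` whatever the value of $\lambda$, and exhibits a row vector of degree `s` in $\lambda$ which annihilates it.

```python
from fractions import Fraction

def chain_block(s, lam):
    """(s+1) x (s+1) block of the first type in (36), evaluated at lam.

    Rows and columns are indexed 0..s; the term (e_{j-1} + lam*e_j) S e_j
    contributes 1 at (j-1, j) and lam at (j, j) for j = 1..s, so column 0
    is zero.
    """
    m = [[Fraction(0)] * (s + 1) for _ in range(s + 1)]
    for j in range(1, s + 1):
        m[j - 1][j] = Fraction(1)
        m[j][j] = Fraction(lam)
    return m

def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def left_null(s, lam):
    """Row vector u with u_j = (-lam)**(s - j), j = 0..s."""
    return [Fraction(-lam) ** (s - j) for j in range(s + 1)]

def row_times(u, m):
    """Product of the row vector u with the matrix m."""
    return [sum(u[i] * m[i][j] for i in range(len(u)))
            for j in range(len(m[0]))]
```

Since the rank does not drop for any value of $\lambda$, the block has no invariant factor other than 1, which is the property used in the proof above.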

When the first summation in (35) is in canonical form, we shall say that
$A\lambda + B$ is in its canonical form. This is however not altogether satisfactory
since the space necessary for this form may be of greater order than $n$. If
$\nu$ is the greater of $\nu_1$ and $\nu_2$, (33) shows that the minimum order of the enveloping
space is $\gamma + \sum s_i + \sum t_p + \nu$. A canonical form for this number of dimensions
can be obtained as follows. Pair the blocks of the first and second types of
(36) till all of one type are used up, taking the order of the constituents in, say,
the order of (36); then in the composite block formed from such a pair discard
the first column and also the row which contains the first row of the second
block. This gives a canonical form for such a pair, namely,

$\begin{Vmatrix}
1 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
\lambda & 1 & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & \lambda & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
\cdot & \cdot & & \cdot & \cdot & \cdot & \cdot & & \cdot & \cdot \\
0 & 0 & \cdots & 1 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & \lambda & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 1 & \lambda & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 1 & \lambda & \cdots & 0 & 0 \\
\cdot & \cdot & & \cdot & \cdot & \cdot & \cdot & & \cdot & \cdot \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & \lambda & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 1 & \lambda
\end{Vmatrix}.$

If the number of chains in the left and right grounds is not the same, there
will of course be blocks of one of the types (36) left unpaired.



CHAPTER V 


COMPOUND MATRICES 

5.01 In chapter I it was found necessary to consider the adjoint of $A$ which is a
matrix whose coordinates are the first minors of $| A |$. We shall now consider
a more general class of matrices, called compound matrices, whose coordinates
are minors of $| A |$ of the $r$th order; before doing so, however, it is convenient
to extend the definition of $Sxy$ to apply to vectors of higher grade.


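5.02 The scalar product. Let $x_i = \sum_j \xi_{ij} e_j$, $y_i = \sum_j \eta_{ij} e_j$ $(i = 1, 2, \cdots)$ be
arbitrary vectors, then, by equation (37) §1.11 we have

(1) $| x_1 x_2 \cdots x_r | = \sum_{(i)}^{*} | \xi_{1i_1} \xi_{2i_2} \cdots \xi_{ri_r} | \, | e_{i_1} e_{i_2} \cdots e_{i_r} |$,

and hence it is natural to extend the notion of the scalar product by setting

(2) $S | x_1 x_2 \cdots x_r | | y_1 y_2 \cdots y_r | = \sum_{(i)}^{*} | \xi_{1i_1} \xi_{2i_2} \cdots \xi_{ri_r} | \, | \eta_{1i_1} \eta_{2i_2} \cdots \eta_{ri_r} |$.

We then have the following lemma which becomes the ordinary rule for
multiplying together two determinants when $r = n$.

Lemma 1.

(3) $S | x_1 x_2 \cdots x_r | | y_1 y_2 \cdots y_r | = | S x_i y_j |$.

For $S | x_1 x_2 \cdots x_r | | e_{i_1} e_{i_2} \cdots e_{i_r} | = | \xi_{1i_1} \xi_{2i_2} \cdots \xi_{ri_r} |$, hence

$S | x_1 x_2 \cdots x_r | | y_1 e_{i_2} \cdots e_{i_r} | = \sum_j \eta_{1j} | \xi_{1j} \xi_{2i_2} \cdots \xi_{ri_r} | = | S x_1 y_1 \; \xi_{2i_2} \cdots \xi_{ri_r} |$;

again

$S | x_1 x_2 \cdots x_r | | y_1 y_2 e_{i_3} \cdots e_{i_r} | = \sum_j \eta_{2j} S | x_1 \cdots x_r | | y_1 e_j e_{i_3} \cdots e_{i_r} |$
$= \sum_j \eta_{2j} | S x_1 y_1 \; \xi_{2j} \; \xi_{3i_3} \cdots \xi_{ri_r} |$
$= | S x_1 y_1 \; S x_2 y_2 \; \xi_{3i_3} \cdots \xi_{ri_r} |$.

The lemma follows easily by a repetition of this process.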
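
Lemma 1 may be checked on numerical vectors. The sketch below is an illustration under our own naming (`det`, `gram_det`, `minor_sum` are not the author's symbols): it computes the sum of products of corresponding minors, as on the right of (2), and the determinant $| S x_i y_j |$ of scalar products, and the two are expected to agree.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (exact for ints)."""
    n = len(m)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def gram_det(xs, ys):
    """|S x_i y_j|: determinant of the r x r array of scalar products."""
    return det([[sum(a * b for a, b in zip(x, y)) for y in ys] for x in xs])

def minor_sum(xs, ys):
    """Sum over i_1 < ... < i_r of the products of corresponding minors."""
    n, r = len(xs[0]), len(xs)
    total = 0
    for cols in combinations(range(n), r):
        mx = det([[x[c] for c in cols] for x in xs])
        my = det([[y[c] for c in cols] for y in ys])
        total += mx * my
    return total

# Illustrative vectors in a space of order 4 (arbitrary sample data).
xs = [[1, 2, 0, 1], [0, 1, 3, -1]]
ys = [[2, 0, 1, 1], [1, 1, 0, 4]]
```

When `r` equals the order of the space there is a single term in the sum and the identity reduces to the rule for multiplying two determinants, as remarked above.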

The Laplace expansion of a determinant can clearly be expressed as a scalar
product. This is most easily done by introducing the notion of the complement
of a vector relative to the fundamental basis. If $i_1, i_2, \cdots, i_r$ is a sequence
of distinct integers in natural order each less than or equal to $n$ and $i_{r+1}, \cdots, i_n$
the remaining integers up to and including $n$, also arranged in natural order,
the complement of $| e_{i_1} e_{i_2} \cdots e_{i_r} |$ relatively to the fundamental basis is
defined as$^1$

(4) $| e_{i_1} e_{i_2} \cdots e_{i_r} |_c = (-1)^{\sum_{\alpha=1}^{r} i_\alpha + r(r+1)/2} \, | e_{i_{r+1}} e_{i_{r+2}} \cdots e_{i_n} |$

and the complement of $| x_1 x_2 \cdots x_r |$ by

(5) $| x_1 x_2 \cdots x_r |_c = \sum_{(i)}^{*} | \xi_{1i_1} \xi_{2i_2} \cdots \xi_{ri_r} | \, | e_{i_1} e_{i_2} \cdots e_{i_r} |_c$

which is a vector of grade $n - r$.

Laplace's expansion of a determinant in terms of minors of order $r$ can now
be expressed in the following form.

Lemma 2.

(6) $S | x_1 x_2 \cdots x_r |_c | x_{r+1} x_{r+2} \cdots x_n | = | \xi_{11} \xi_{22} \cdots \xi_{nn} | = | S x_i e_j |$
$= S | x_1 \cdots x_n | | e_1 \cdots e_n | = (-1)^{r(n-r)} S | x_1 x_2 \cdots x_r | | x_{r+1} \cdots x_n |_c$.
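
The expansion expressed by Lemma 2 can be sketched in code. The function below is an illustration with names of our own choosing; it expands a determinant by the minors of order `r` drawn from the first `r` rows, each multiplied by the complementary minor and by the sign $(-1)^{\sum i_\alpha + r(r+1)/2}$ which occurs in (4), the indices being counted from 1.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (exact for ints)."""
    n = len(m)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def laplace_first_rows(a, r):
    """Laplace expansion of det(a) by minors of order r from the first r rows."""
    n = len(a)
    total = 0
    for cols in combinations(range(n), r):
        rest = [j for j in range(n) if j not in cols]
        top = [[a[i][j] for j in cols] for i in range(r)]
        bottom = [[a[i][j] for j in rest] for i in range(r, n)]
        # Sign of (4) with 1-based column indices i_1 < ... < i_r.
        sign = (-1) ** (sum(c + 1 for c in cols) + r * (r + 1) // 2)
        total += sign * det(top) * det(bottom)
    return total

# Illustrative matrix of order 4 (arbitrary sample data).
sample = [[2, 1, 0, 3],
          [1, 0, 1, 1],
          [0, 2, 1, 0],
          [1, 1, 0, 2]]
```

For `r = 1` the expansion reduces to the ordinary development by the elements of the first row.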

Further as an immediate consequence of (5) we have

Lemma 3.

(7) $S | x_1 x_2 \cdots x_r |_c | y_1 y_2 \cdots y_r |_c = S | x_1 x_2 \cdots x_r | | y_1 y_2 \cdots y_r |$.

5.03 Compound matrices. If $A = \| a_{ij} \|$, then, as in (1)

$| Ax_1 Ax_2 \cdots Ax_r | = \sum_{(j)}^{*} | \xi_{1j_1} \xi_{2j_2} \cdots \xi_{rj_r} | \, | Ae_{j_1} Ae_{j_2} \cdots Ae_{j_r} |$.

But $Ae_j = \sum_i a_{ij} e_i$, hence a second application of (1) gives

$| Ax_1 Ax_2 \cdots Ax_r | = \sum_{(i)}^{*} \sum_{(j)}^{*} | \xi_{1j_1} \cdots \xi_{rj_r} | \, | a_{i_1j_1} \cdots a_{i_rj_r} | \, | e_{i_1} \cdots e_{i_r} |$.

But the determinants $| \xi_{1j_1} \cdots \xi_{rj_r} |$ are the coordinates of the $r$-vector
$| x_1 x_2 \cdots x_r |$; hence $| Ax_1 Ax_2 \cdots Ax_r |$ is a linear vector form in $| x_1 x_2 \cdots x_r |$
in the corresponding space of $\binom{n}{r}$ dimensions. We denote this vector function
or matrix by $C_r(A)$ and write

(8) $| Ax_1 Ax_2 \cdots Ax_r | = C_r(A) | x_1 x_2 \cdots x_r |$.

We shall call $C_r(A)$ the $r$th compound of $A$. Important particular cases are

(8') $C_1(A) = A, \quad C_n(A) = | A |$,

and, if $k$ is a scalar,

(8'') $C_r(k) = k^r$.

$^1$ The Grassmann notation cannot be conveniently used here since it conflicts with the
notation for a determinant. It is sometimes convenient to define the complement of
$| e_1 e_2 \cdots e_n |$ as 1.

Theorem 1.

(9) $C_r(AB) = C_r(A) C_r(B)$.

For

$| ABx_1 ABx_2 \cdots ABx_r | = C_r(A) | Bx_1 Bx_2 \cdots Bx_r |$
$= C_r(A) C_r(B) | x_1 x_2 \cdots x_r |$.

Corollary. If $| A | \ne 0$, then

(10) $[C_r(A)]^{-1} = C_r(A^{-1})$.
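
The $r$th compound is easily formed for a numerical matrix, its coordinates being the minors of order $r$ arranged by the row and column sets taken in a fixed order. The following sketch (function names are our own; the lexicographic ordering of the index sets is an assumption of the example) forms $C_r(A)$ in this way and allows (8') and Theorem 1 to be verified on sample matrices.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (exact for ints)."""
    n = len(m)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def compound(a, r):
    """r-th compound: minors of order r, index sets in lexicographic order."""
    idx = list(combinations(range(len(a)), r))
    return [[det([[a[i][j] for j in cols] for i in rows]) for cols in idx]
            for rows in idx]

def matmul(a, b):
    """Ordinary matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Illustrative matrices of order 3 (arbitrary sample data).
A = [[1, 2, 0], [3, 1, 4], [0, 5, 6]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
```

Any fixed ordering of the index sets would serve equally well, provided the same ordering is used for rows and columns.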


Theorem 2. 


(11) [CM)] 1 = CM'). 

h’or B I XiXi ■ ■ ■ Xr I <V(.4) | 1/ i]J 2 ■ ■ ■ Ur | = | SXiAi/j | = | SA'xHJj | 

= B | vl'.r, ■ • • A'xr \\yi ■ ■ ■ y r | = B \ yi ■ • • y r \ CM') \ x\ • • • x r \. 

Theorem 3. If $A = \sum_{i=1}^{m} a_i S b_i$, then

(12) $C_r(A) = \sum_{(i)} |a_{i_1} a_{i_2} \cdots a_{i_r}|\, S\, |b_{i_1} b_{i_2} \cdots b_{i_r}|.$

This theorem follows by direct substitution for $A$ in the left-hand side of (8).
It gives a second proof for Theorem 2.

If $r = m$, (12) consists of one term only, and this term is 0 unless $m$ is the
rank of $A$, a property which might have been made the basis of the definition
of rank.

In particular, if $X = \sum_{1}^{r} x_i S e_i$, $Y = \sum_{1}^{r} y_i S e_i$, then
$C_r(X) = |x_1 x_2 \cdots x_r|\, S\, |e_1 e_2 \cdots e_r|$, $C_r(Y) = |y_1 y_2 \cdots y_r|\, S\, |e_1 e_2 \cdots e_r|$ so that
$C_r(X'Y) = |e_1 e_2 \cdots e_r|\, S\, |x_1 x_2 \cdots x_r|\, |y_1 y_2 \cdots y_r|\, S\, |e_1 e_2 \cdots e_r|$. But
$X'Y = \sum_{i,j} e_i\, S x_i y_j\, S e_j$ so that $C_r(X'Y) = |S x_i y_j|\, |e_1 \cdots e_r|\, S\, |e_1 \cdots e_r|$.
Comparing these two forms of $C_r(X'Y)$ therefore gives another proof of the
first lemma of §5.02.

If we consider the complement of $|Ax_1 Ax_2 \cdots Ax_r|$ we arrive at a new
matrix $C^r(A)$ of order $\binom{n}{r}$ which is called the $r$th supplementary compound of $A$.
From (7) and (12) we have

(13) $|Ax_1 Ax_2 \cdots Ax_r|_c = \sum_{(i)} |a_{i_1} \cdots a_{i_r}|_c\, S\, |b_{i_1} \cdots b_{i_r}|_c\, |x_1 \cdots x_r|_c = C^r(A)\, |x_1 x_2 \cdots x_r|_c$

and derive immediately the following which are analogous to Theorems 1 and 2.

Theorem 4.

(14) $C^r(AB) = C^r(A)\, C^r(B).$

Theorem 5.

(15) $[C^r(A)]' = C^r(A').$

The following theorems give the connection between compounds and supplementary
compounds and also compounds of compounds.

Theorem 6.

(16) $C^r(A')\, C_{n-r}(A) = |A| = C^{n-r}(A)\, C_r(A').$

This is the Laplace expansion of the determinant $|A|$. Using equation (6)
and setting $|e|$ for $|e_1 \cdots e_n|$ we have

$|A|\, S\, |x_1 x_2 \cdots x_r|_c\, |x_{r+1} \cdots x_n| = |A|\, S\, |x_1 \cdots x_n|\, |e|$
$= S\, |Ax_1 \cdots Ax_n|\, |e|$
$= S\, |Ax_1 \cdots Ax_r|_c\, |Ax_{r+1} \cdots Ax_n|$
$= S\, C^r(A)\, |x_1 \cdots x_r|_c\; C_{n-r}(A)\, |x_{r+1} \cdots x_n|$
$= S\, |x_1 \cdots x_r|_c\; C^r(A')\, C_{n-r}(A)\, |x_{r+1} \cdots x_n|$

and, since the $x$'s are arbitrary, the first part of the theorem follows. The
second part is proved in a similar fashion.

Putting $r = n - 1$ in (16) gives the following corollary.

Corollary. $\operatorname{adj} A = C^{n-1}(A').$
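The Laplace expansion behind Theorem 6 can be illustrated with explicit minors. The sketch below is not the author's construction: it assumes lexicographic index sets and uses signed complementary minors, with sign $(-1)^{\Sigma J + \Sigma K}$, in place of the supplementary compound; that sign and ordering convention, and the names `minor` and `laplace_product`, are assumptions made for the illustration.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor(a, rows, cols):
    """Minor of a on the given row and column index sets."""
    return det([[a[i][j] for j in cols] for i in rows])

def laplace_product(a, r):
    """Product of the r-rowed minors of A' with the signed complementary minors of A."""
    n = len(a)
    idx = list(combinations(range(n), r))

    def comp(s):
        return tuple(k for k in range(n) if k not in s)

    # m[I][J]: the minor of A' on rows I, columns J, i.e. the minor of A on rows J, columns I
    m = [[minor(a, J, I) for J in idx] for I in idx]
    # s[J][K]: complementary minor of A with the assumed sign (-1)^(sum J + sum K)
    s = [[(-1) ** (sum(J) + sum(K)) * minor(a, comp(J), comp(K)) for K in idx] for J in idx]
    size = len(idx)
    return [[sum(m[i][k] * s[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

A = [[2, 1, 0, 3], [1, -1, 4, 0], [0, 2, 1, 1], [5, 0, -2, 1]]
d = det(A)
for r in range(1, 5):
    P = laplace_product(A, r)
    expected = [[d if i == j else 0 for j in range(len(P))] for i in range(len(P))]
    print(r, P == expected)
```

For `r = 1` the second factor is the matrix of cofactors of $A$, so the product reduces to the familiar relation between $A$ and its adjoint stated in the corollary.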

Theorem 7.

(17) $|C_r(A)| = |A|^{\binom{n-1}{r-1}} = |C^r(A)|.$

For from (16) with $A'$ in place of $A$, and from the fact that the order of
$C_r(A)$ is $\binom{n}{r}$, we have

$|A|^{\binom{n}{r}} = |C^r(A)\, C_{n-r}(A')| = |C^r(A)|\, |C_{n-r}(A')|$

and, since $|A|$ is irreducible when the coordinates of $A$ are arbitrary variables,
it follows that $|C^r(A)|$ is a power of $|A|$. Considerations of degree then show
that the theorem is true when the coordinates are variables and, since the
identity is integral, it follows that it is also true for any particular values of
these variables.
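The determinant formula (17) for the compound lends itself to a direct numerical check. This is a minimal sketch under the same assumption as before (coordinates of $C_r(A)$ are the $r$-rowed minors in lexicographic order); the matrix `A` is an arbitrary illustrative choice.

```python
from itertools import combinations
from math import comb

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def compound(a, r):
    """r-th compound: all r-rowed minors, index sets in lexicographic order (assumed ordering)."""
    idx = list(combinations(range(len(a)), r))
    return [[det([[a[i][j] for j in cols] for i in rows]) for cols in idx] for rows in idx]

A = [[2, 1, 0, 3], [1, -1, 4, 0], [0, 2, 1, 1], [5, 0, -2, 1]]
n = len(A)

# |C_r(A)| compared with |A| raised to the power binom(n-1, r-1), for each r
checks = {r: det(compound(A, r)) == det(A) ** comb(n - 1, r - 1) for r in range(1, n + 1)}
print(checks)
```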


Theorem 8.

(18) $|A|^{\binom{n-1}{r}}\, C_s(C_r(A)) = |A|^{s}\, C^{\binom{n}{r}-s}\bigl(C^{n-r}(A)\bigr),$

(19) $|A|^{\binom{n-1}{r}}\, C_s(C^r(A)) = |A|^{s}\, C^{\binom{n}{r}-s}\bigl(C_{n-r}(A)\bigr).$

Using (15), (16) and (17) we get

$C_s(C^{n-r}(A'))\, C^{\binom{n}{r}-s}(C^{n-r}(A)) = |C^{n-r}(A)| = |A|^{\binom{n-1}{r}};$

therefore

$|A|^{\binom{n-1}{r}}\, C_s(C_r(A)) = C_s(C_r(A))\, C_s(C^{n-r}(A'))\, C^{\binom{n}{r}-s}(C^{n-r}(A))$
$= C_s\bigl(C_r(A)\, C^{n-r}(A')\bigr)\, C^{\binom{n}{r}-s}(C^{n-r}(A))$
$= C_s(|A|)\, C^{\binom{n}{r}-s}(C^{n-r}(A))$
$= |A|^{s}\, C^{\binom{n}{r}-s}(C^{n-r}(A)).$

Similarly

$C_s(C_{n-r}(A'))\, C_s(C^r(A)) = C_s(|A|) = |A|^{s}$

and therefore

$|A|^{s}\, C^{\binom{n}{r}-s}(C_{n-r}(A)) = C^{\binom{n}{r}-s}(C_{n-r}(A))\, C_s(C_{n-r}(A'))\, C_s(C^r(A))$
$= |C_{n-r}(A)|\, C_s(C^r(A))$
$= |A|^{\binom{n-1}{r}}\, C_s(C^r(A)).$

An important particular case is $C_s(C^{n-1}(A)) = |A|^{s-1}\, C^{n-s}(A)$ whence

(20) $C_s(\operatorname{adj} A) = C_s(C^{n-1}(A')) = |A|^{s-1}\, C^{n-s}(A').$


5.04 Roots of compound matrices. If $A$ has simple elementary divisors
and its roots are $\lambda_1, \lambda_2, \cdots, \lambda_n$, the corresponding invariant vectors being
$a_1, a_2, \cdots, a_n$, then the roots of $C_r(A)$ are the products $\lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_r}$ in which
no two subscripts are the same and the subscripts are arranged in, say, numerical
order; and the invariant vector corresponding to $\lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_r}$ is $|a_{i_1} a_{i_2} \cdots a_{i_r}|$.
For there are $\binom{n}{r}$ distinct vectors of this type and

$C_r(A)\, |a_{i_1} a_{i_2} \cdots a_{i_r}| = |Aa_{i_1} \cdots Aa_{i_r}| = \lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_r}\, |a_{i_1} a_{i_2} \cdots a_{i_r}|.$

Similarly for $C^r(A)$ the roots and invariants are $\lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_r}$ and
$|a_{i_1} a_{i_2} \cdots a_{i_r}|_c$.

It follows from considerations of continuity that the roots are as given above
even when the elementary divisors are not simple.
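A quick way to see the statement about the roots without computing eigenvalues is to take a triangular matrix, whose roots are its diagonal elements. The sketch below assumes lexicographic ordering of the index sets, under which the compound of an upper triangular matrix is again upper triangular; the matrix `A` and the helper names are illustrative.

```python
from itertools import combinations
from math import prod

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def compound(a, r):
    """r-th compound: all r-rowed minors, index sets in lexicographic order (assumed ordering)."""
    idx = list(combinations(range(len(a)), r))
    return [[det([[a[i][j] for j in cols] for i in rows]) for cols in idx] for rows in idx]

# Upper triangular, so the roots of A are its diagonal elements 2, 3, -1, 7
A = [[2, 1, -1, 4], [0, 3, 5, 1], [0, 0, -1, 2], [0, 0, 0, 7]]
r = 2
index_sets = list(combinations(range(len(A)), r))
C = compound(A, r)

# Diagonal of C_r(A), to be compared with the products of r distinct roots of A
diagonal = [C[k][k] for k in range(len(C))]
root_products = [prod(A[i][i] for i in I) for I in index_sets]
# C_r(A) is itself triangular, so its diagonal elements are its roots
below_diagonal_zero = all(C[p][q] == 0 for p in range(len(C)) for q in range(p))
print(diagonal == root_products, below_diagonal_zero)
```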

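5.05 Bordered determinants. Let $A = \|a_{ij}\| = \sum_{j=1}^{n} a_j S e_j$, $a_j = \sum_{i} a_{ij} e_i$,
be any matrix and associate with it two sets of vectors

$X:\quad x_i = \sum_{j=1}^{n} \xi_{ij} e_j,$

$Y:\quad y_i = \sum_{j=1}^{n} \eta_{ij} e_j \qquad (i = 1, 2, \cdots, r).$

Consider the bordered determinant

$\Delta_r = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & \xi_{11} & \cdots & \xi_{r1} \\
a_{21} & a_{22} & \cdots & a_{2n} & \xi_{12} & \cdots & \xi_{r2} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & \xi_{1n} & \cdots & \xi_{rn} \\
\eta_{11} & \eta_{12} & \cdots & \eta_{1n} & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\eta_{r1} & \eta_{r2} & \cdots & \eta_{rn} & 0 & \cdots & 0
\end{vmatrix}
= \begin{vmatrix} A & X \\ Y' & 0_r \end{vmatrix}$

where $r < n$, and $0_r$ is a square block of 0's with $r$ rows and columns.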

If we introduce $r$ additional fundamental units $e_{n+1}, \cdots, e_{n+r}$, $\Delta_r$ can be
regarded as the determinant of a matrix $\mathfrak{A}$ of order $n + r$, namely,

$\mathfrak{A} = \sum_{1}^{n} a_i S e_i + \sum_{1}^{r} x_i S e_{n+i} + \sum_{1}^{r} e_{n+i} S y_i = \sum_{1}^{n+2r} c_i S d_i.$

If now we form $|\mathfrak{A}| = S\, |e|\, C_{n+r}(\mathfrak{A})\, |e|$ as in §5.03, we have

$C_{n+r}(\mathfrak{A}) = \sum_{(i)} |c_{i_1} \cdots c_{i_{n+r}}|\, S\, |d_{i_1} \cdots d_{i_{n+r}}| \qquad (i = 1, 2, \cdots, n + 2r).$

In this form any $|c_{i_1} \cdots c_{i_{n+r}}|$ which contains more than $n$ out of $a_1, \cdots, a_n$,
$x_1, \cdots, x_r$ is necessarily 0; also, if it does not contain all the $x$'s, the corresponding
$|d_{i_1} \cdots d_{i_{n+r}}|$ will contain more than $n$ out of $e_1, \cdots, e_n$,
$y_1, \cdots, y_r$ and is consequently 0. We therefore have


Cn + r(3[) - 2)' 1 | ttifii, ■ ■ ■ a in _ r z&2 ■ • • ;i\r„ ■ e„ , | 

(O 

X S I e-ifii, ■ ■ • a n _ r y,y 2 ■ ■ ■ y r e n + , ... r , l + ,.\ (/! = 1,2, ■ • ■ , „) 
and hence, passing back to space of n dimensions, 

|2( 1 SMK-, ••• ••• a; r J *Sf | e it ... e inr!h ... , jr || ,• I 

= S I Xl " ‘ Xr II ‘ ‘ ' a ‘«-r \JS | e tl ■■■ C inr | c | //, • • • //,. | 

~ ^ ! Xl ' ' ' Xr \ C n ~ r (A) I 2/i ... y r |. 

This relation shows why the bordered determinant is frequently used in 
place of the corresponding compound in dealing with the theory of forms. 

5.06 The reduction of bilinear forms. The Lagrange method of reducing
quadratic and bilinear forms to a normal form is, as we shall now see, closely
connected with compounds.

If $A$ is any matrix, not identically 0, there exist vectors $x_1$, $y_1$ such that
$Sx_1Ay_1 \neq 0$; then, setting $A = A_1$ for convenience, the matrix

$A_2 = A_1 - A_1y_1\, \dfrac{SA_1'x_1}{Sx_1A_1y_1}$

has its rank exactly 1 less than that of $A$. For, if $A_1z = 0$, then

$A_2z = A_1z - A_1y_1\, \dfrac{Sx_1A_1z}{Sx_1A_1y_1} = 0$

and, conversely, if $A_2z = 0$, then

$A_1z = A_1y_1\, \dfrac{Sx_1A_1z}{Sx_1A_1y_1} = kA_1y_1,$

say, or $A_1(z - ky_1) = 0$. The null-space of $A_2$ is therefore obtained from that
of $A_1$ by adding $y_1$ to its basis, which increases the order of this space by 1
since $A_1y_1 \neq 0$.

If $A_2 \neq 0$, this process may be repeated, that is, there exist $x_2$, $y_2$ such that
$Sx_2A_2y_2 \neq 0$ and the rank of

$A_3 = A_2 - A_2y_2\, \dfrac{SA_2'x_2}{Sx_2A_2y_2}$

is 1 less than that of $A_2$. If the rank of $A$ is $r$, we may continue this process
by setting

(22) $A_{s+1} = A_s - A_sy_s\, \dfrac{SA_s'x_s}{Sx_sA_sy_s} \qquad (s = 1, 2, \cdots, r)$

where $Sx_sA_sy_s \neq 0$ and $A_1 = A$, $A_{r+1} = 0$; we then have

(23) $A = \sum_{s=1}^{r} \dfrac{A_sy_s\, SA_s'x_s}{Sx_sA_sy_s}$

where each $A_sy_s\, SA_s'x_s$ is a matrix of rank 1. Generally speaking, one may
take $x_s = y_s$, and it is of some interest to determine when this is not possible.
If $SxBx = 0$ for every $x$, we readily see that $B$ is skew. For then $Se_iBe_i
= Se_jBe_j = S(e_i + e_j)B(e_i + e_j) = 0$ and therefore

$0 = S(e_i + e_j)B(e_i + e_j) = Se_iBe_i + Se_jBe_j + Se_iBe_j + Se_jBe_i,$

that is, $Se_iBe_j = -Se_jBe_i$, and hence $B' = -B$. Hence we may take $x_s = y_s$
so long as $A_s \neq -A_s'$.
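The reduction (22) and the resulting decomposition (23) can be traced step by step in exact arithmetic. In the sketch below the dyad $A_sy_s\,SA_s'x_s$ is formed as the product of the column $A_sy_s$ by the row $x_s'A_s$; taking $x_s$ and $y_s$ to be fundamental units picking out the first non-zero coordinate of $A_s$ is merely one convenient choice made for the illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    rk = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][c] / m[rk][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def reduce_once(a, x, y):
    """One step of (22): returns A_{s+1} and the rank-one term removed from A_s."""
    n = len(a)
    ay = [sum(a[i][j] * y[j] for j in range(n)) for i in range(n)]   # A y
    xa = [sum(x[i] * a[i][j] for i in range(n)) for j in range(n)]   # A'x, used as a row
    w = sum(x[i] * ay[i] for i in range(n))                          # S x A y, must not vanish
    term = [[ay[i] * xa[j] / w for j in range(n)] for i in range(n)]
    return [[a[i][j] - term[i][j] for j in range(n)] for i in range(n)], term

def lagrange_reduction(a):
    """Repeat (22) until the matrix vanishes; record the ranks and the rank-one terms."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    ranks, terms = [rank(a)], []
    while any(v != 0 for row in a for v in row):
        i, j = next((i, j) for i in range(n) for j in range(n) if a[i][j] != 0)
        x = [Fraction(int(k == i)) for k in range(n)]   # x_s = e_i (illustrative choice)
        y = [Fraction(int(k == j)) for k in range(n)]   # y_s = e_j (illustrative choice)
        a, term = reduce_once(a, x, y)
        terms.append(term)
        ranks.append(rank(a))
    return ranks, terms

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
ranks, terms = lagrange_reduction(A)
total = [[sum(t[i][j] for t in terms) for j in range(3)] for i in range(3)]
print(ranks, total == A)
```

Each pass lowers the rank by exactly one, and the recorded terms add up to the original matrix as in (23).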


5.07 We shall now derive more explicit forms for the terms in (23) and show
how they lead to the Sylvester-Franke theorems on compound determinants.
Let $x^1, x^2, \cdots, x^r$, $y^1, y^2, \cdots, y^r$ be variable vectors and set

(24) $J = S\, |x_s x^1 x^2 \cdots x^r|\, C_{r+1}(A_s)\, |y_s y^1 y^2 \cdots y^r|$
$= S\, |x_s x^1 x^2 \cdots x^r|\, |A_sy_s\, A_sy^1 \cdots A_sy^r|;$

then from (22)

$J = S\, |x_s x^1 \cdots x^r|\, |A_sy_s\, A_{s+1}y^1 \cdots A_{s+1}y^r|$
$= |Sx_sA_sy_s\;\; Sx^1A_{s+1}y^1 \cdots Sx^rA_{s+1}y^r|.$

If the $x$'s denote rows in this determinant, the first row is

$Sx_sA_sy_s,\; Sx_sA_{s+1}y^1,\; \cdots,\; Sx_sA_{s+1}y^r$

each term of which is 0 except the first, since $x_s$ lies in the null-space of $A_{s+1}'$,
and $Sx_sA_sy_s \neq 0$. Hence

(25) $J = Sx_sA_sy_s\, |Sx^1A_{s+1}y^1 \cdots Sx^rA_{s+1}y^r|$

and therefore from (24)

(26) $S\, |x_s x^1 \cdots x^r|\, C_{r+1}(A_s)\, |y_s y^1 \cdots y^r| = Sx_sA_sy_s\; S\, |x^1 \cdots x^r|\, C_r(A_{s+1})\, |y^1 \cdots y^r|.$

Repeated application of this relation gives

(27) $S\, |x_s x_{s+1} \cdots x_{s+t-1} x^1 x^2 \cdots x^r|\, C_{r+t}(A_s)\, |y_s y_{s+1} \cdots y_{s+t-1} y^1 y^2 \cdots y^r|$
$= Sx_sA_sy_s\; Sx_{s+1}A_{s+1}y_{s+1} \cdots Sx_{s+t-1}A_{s+t-1}y_{s+t-1}\; S\, |x^1 \cdots x^r|\, C_r(A_{s+t})\, |y^1 \cdots y^r|,$

a particular case of which is

(27') $S\, |x_1 x_2 \cdots x_{t-1} x_t|\, C_t(A)\, |y_1 \cdots y_{t-1} y_t| = Sx_1A_1y_1 \cdots Sx_{t-1}A_{t-1}y_{t-1}\; Sx_tA_ty_t.$

To simplify these and similar formulae we shall now use a single letter to
indicate a sequence of vectors; thus we shall set $X_{s,s+t-1}$ for $x_s x_{s+1} \cdots
x_{s+t-1}$ and $Y^r$ for $y^1 y^2 \cdots y^r$; also $C_{r,s}$ for $C_r(A_s)$. Equations (26) and (27)
may then be written

(26a) $S\, |x_s X^r|\, C_{r+1,s}\, |y_s Y^r| = Sx_sA_sy_s\; S\, |X^r|\, C_{r,s+1}\, |Y^r|,$

(27a) $S\, |X_{s,s+t-1} X^r|\, C_{r+t,s}\, |Y_{s,s+t-1} Y^r| = \prod_{i=s}^{s+t-1} Sx_iA_iy_i\; S\, |X^r|\, C_{r,s+t}\, |Y^r|.$

We get a more convenient form for (26a), namely

(28) $S\, |X_{s,t} X^r|\, C_{r+t-s+1,s}\, |Y_{s,t} Y^r| = Sx_sA_sy_s\; S\, |X_{s+1,t} X^r|\, C_{r+t-s,s+1}\, |Y_{s+1,t} Y^r|$

by replacing $r$ by $r + t - s$ and then changing $x^1 x^2 \cdots x^{r+t-s}$ into
$x_{s+1} \cdots x_t x^1 \cdots x^r$ along with a similar change in the $y$'s. Putting $s
= 1, 2, \cdots, t$ in succession and forming the product of corresponding sides of
the equations so obtained from (28) we get after canceling the common factors,
which are not identically 0 provided that $r + t$ is not greater than the rank of $A$,

(29) $S\, |X_t X^r|\, C_{r+t,1}\, |Y_t Y^r| = \prod_{i=1}^{t} Sx_iA_iy_i\; S\, |X^r|\, C_{r,t+1}\, |Y^r|,$

or from (27')

(30) $S\, |X_t X^r|\, C_{r+t}\, |Y_t Y^r| = S\, |X_t|\, C_t\, |Y_t|\; S\, |X^r|\, C_{r,t+1}\, |Y^r|$

which may also be written in the form

(30') $\dfrac{S\, |X_t X^r|\, C_{r+t}\, |Y_t Y^r|}{S\, |X_t|\, C_t\, |Y_t|} = |Sx^iA_{t+1}y^j|;$

in particular

(31) $\dfrac{S\, |X_t x|\, C_{t+1}(A)\, |Y_t y|}{S\, |X_t|\, C_t(A)\, |Y_t|} = SxA_{t+1}y.$

This gives a definition of $A_{t+1}$ which may be used in place of (22); it shows that
this matrix depends on $2t$ vector parameters. It is more convenient for some
purposes to use the matrix $A^{(t)}$ defined by

(32) $SxA^{(t)}y = S\, |X_t x|\, C_{t+1}(A)\, |Y_t y|.$

From (31) we then have $Sx^iA_{t+1}y^j = Sx^iA^{(t)}y^j / S\, |X_t|\, C_t\, |Y_t|$ and therefore
from (30')

(33) $|Sx^iA_{t+1}y^j| = \dfrac{|Sx^iA^{(t)}y^j|}{[S\, |X_t|\, C_t\, |Y_t|]^r} = \dfrac{S\, |X^r|\, C_r(A^{(t)})\, |Y^r|}{[S\, |X_t|\, C_t(A)\, |Y_t|]^r}.$

Hence

(34) $S\, |X_t X^r|\, C_{r+t}(A)\, |Y_t Y^r| = \dfrac{S\, |X^r|\, C_r(A^{(t)})\, |Y^r|}{[S\, |X_t|\, C_t(A)\, |Y_t|]^{r-1}}$

which is readily recognized as Sylvester's theorem if the $x$'s are replaced by
fundamental units and the integral form of (33) is used.
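When the vectors are replaced by fundamental units, the relation just obtained becomes the classical statement of Sylvester's theorem on bordered minors, which can be verified numerically. The sketch below assumes $x_i = y_i = e_i$ for the first $t$ units and the remaining units for the variable vectors, with $r + t = n$; the helper names and the test matrix are illustrative.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor(a, rows, cols):
    """Minor of a on the given row and column index lists."""
    return det([[a[i][j] for j in cols] for i in rows])

def sylvester_sides(a, t):
    """Return (|B|, pivot^(r-1) * |A|), where B holds the leading t-rowed minor bordered in turn."""
    n = len(a)
    lead = list(range(t))
    pivot = minor(a, lead, lead)                      # leading t-rowed minor of A
    b = [[minor(a, lead + [i], lead + [j]) for j in range(t, n)] for i in range(t, n)]
    r = n - t
    return det(b), pivot ** (r - 1) * det(a)

A = [[2, 1, 0, 3], [1, -1, 4, 0], [0, 2, 1, 1], [5, 0, -2, 1]]
results = {t: sylvester_sides(A, t) for t in range(0, 4)}
print({t: lhs == rhs for t, (lhs, rhs) in results.items()})
```

Here the matrix `b` plays the part of the coordinates of $A^{(t)}$ in (32), and the comparison is the integral form of the relation, so no division is needed.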


5.08 Invariant factors. We shall now apply the above results in deriving
the normal form of §3.02. We require first, however, the following lemma.

Lemma 4. If $A(\lambda)$ is a matric polynomial, there exists a constant vector $y$ and a
vector polynomial $x$ such that $SxAy$ is the highest common factor of the coordinates
of $A$.

Let $y = \sum \eta_i e_i$ be a vector whose coordinates are variables independent of $\lambda$.
Let $\alpha_1$ be the H. C. F. of the coordinates of $A = \|a_{ij}\|$ and set

$A = \alpha_1 B, \qquad By = \sum \eta_j b_{ij} e_i = \sum \beta_i e_i.$

There is no value $\lambda_1$ of $\lambda$ independent of the $\eta$'s for which every $\beta_i = 0$, for if
this were so, $\lambda - \lambda_1$ would be a factor of each $b_{ij}$ and $\alpha_1$ could not then be the
H. C. F. of the $a_{ij}$. Hence the resultant of $\beta_1, \beta_2, \cdots, \beta_n$ as polynomials in $\lambda$
is not identically 0 as a polynomial in the $\eta$'s; there are therefore values of the
$\eta$'s for which this resultant is not 0, and for these values the $\beta$'s have no factor
common to all. There then exist scalar polynomials $\xi_1, \xi_2, \cdots, \xi_n$ such that
$\sum \xi_i \beta_i = 1$ and therefore, if $x = \sum \xi_i e_i$, we have $SxBy = 1$ or $SxAy = \alpha_1$.

Returning now to the form of $A$ given in §5.06, namely

$A = \sum_{s=1}^{r} \dfrac{A_sy_s\, SA_s'x_s}{Sx_sA_sy_s},$

we can as above choose $x_s$, $y_s$ in such a manner that $Sx_sA_sy_s = \alpha_s$ is the highest
common factor of the coordinates of $A_s$ and, when this is done, $v_s = A_sy_s/\alpha_s$,
$u_s = A_s'x_s/\alpha_s$ are integral in $\lambda$. We then have

(35) $A = \sum_s \dfrac{A_sy_s\, SA_s'x_s}{\alpha_s} = \sum_s \alpha_s v_s S u_s.$

Moreover $A_sy_i = 0 = A_s'x_i$ when $i < s$ and therefore in

$S\, |x_1 \cdots x_r|\, |A_1y_1\, A_2y_2 \cdots A_ry_r| = |Sx_iA_jy_j| = |SA_j'x_i\, y_j|$

all terms on one side of the main diagonal are 0 so that it reduces to $Sx_1A_1y_1
\cdots Sx_rA_ry_r = \alpha_1\alpha_2 \cdots \alpha_r$. Hence, dividing by $\alpha_1 \cdots \alpha_r$ and replacing
$A_jy_j/\alpha_j$ by $v_j$ as above, we see that $|x_1 \cdots x_r|$ and $|v_1 \cdots v_r|$ are not 0 for
any value of $\lambda$, and therefore the constituent vectors in each set remain linearly
independent for all values of $\lambda$. It follows in the same way that the sets
$u_1, \cdots, u_r$ and $y_1, \cdots, y_r$, respectively, are also linearly independent for all
values of $\lambda$, that is, these four sets are elementary sets. It follows from Theorem 5
§4.03, that we can find elementary polynomials $P$ and $Q$ such that

$Pv_i = e_i = Q'u_i \qquad (i = 1, 2, \cdots, r),$

and hence

(36) $PAQ = P \sum \alpha_s v_s S u_s\, Q = \sum \alpha_s e_s S e_s$

which is the normal form of §3.02.

5.09 Vector products. Let $x_i = \sum_j \xi_{ij} e_j$ ($i = 1, 2, \cdots, r$) be a set of arbitrary
vectors and consider the set of all products of the form $\xi_{1j_1}\xi_{2j_2} \cdots \xi_{rj_r}$
arranged in some definite order. These products may then be regarded as the
coordinates of a hypernumber² of order $n^r$ which we shall call the tensor product³
of $x_1, x_2, \cdots, x_r$ and we shall denote it by $x_1 x_2 \cdots x_r$. In particular if we
take all the products $e_{i_1} e_{i_2} \cdots e_{i_r}$ ($i_1, i_2, \cdots, i_r = 1, 2, \cdots, n$) each has all
its coordinates zero except one, which has the value 1, and no two are equal.
Further

$x_1 x_2 \cdots x_r = \sum \xi_{1i_1}\xi_{2i_2} \cdots \xi_{ri_r}\, e_{i_1} e_{i_2} \cdots e_{i_r}.$

2 The term 'hypernumber' is used in place of vector, as defined in §1.01 since we now
wish to use the term 'vector' in a more restricted sense.

3 This product was called by Grassmann the general or indeterminate product.

If we regard the products $e_{i_1} e_{i_2} \cdots e_{i_r}$ as the basis of the set of hypernumbers,
we are naturally led to consider sums of the type

$w = \sum \omega_{i_1 i_2 \cdots i_r}\, e_{i_1} e_{i_2} \cdots e_{i_r}$

where the $\omega$'s are scalars; and we shall call such a hypernumber a tensor of
grade $r$. It is readily seen that the product $x_1 x_2 \cdots x_r$ is distributive and
homogeneous with regard to each of its factors, that is,

$x_1(\lambda x_2 + \mu y_2)x_3 \cdots x_r = \lambda\, x_1 x_2 \cdots x_r + \mu\, x_1 y_2 x_3 \cdots x_r.$

The product of two tensors of grade $r$ and $s$ is then defined by assuming the
distributive law and setting

$(e_{i_1} e_{i_2} \cdots e_{i_r})(e_{j_1} \cdots e_{j_s}) = e_{i_1} \cdots e_{i_r} e_{j_1} \cdots e_{j_s}.$

It is easily shown that the product so defined is associative; it is however not
commutative as is seen from the example

$x_1x_2 - x_2x_1 = \sum (\xi_{1i_1}\xi_{2i_2} - \xi_{2i_1}\xi_{1i_2})\, e_{i_1}e_{i_2}
= \sum_{(i)} \begin{vmatrix} \xi_{1i_1} & \xi_{1i_2} \\ \xi_{2i_1} & \xi_{2i_2} \end{vmatrix} (e_{i_1}e_{i_2} - e_{i_2}e_{i_1}).$

Here the coefficients of $e_{i_1}e_{i_2} - e_{i_2}e_{i_1}$ ($i_1 < i_2$) are the coordinates of $|x_1x_2|$
so that this tensor might have been defined in terms of the tensor product by
setting

$|x_1x_2| = x_1x_2 - x_2x_1.$

In the same way, if we form the expression⁴

$f(x_1, x_2, \cdots, x_r) = \sum \operatorname{sgn}(i_1, i_2, \cdots, i_r)\, x_{i_1} x_{i_2} \cdots x_{i_r},$

and expand it in terms of the coordinates of the $x$'s and the fundamental units,
it is readily shown that the result is

$\sum_{(i)} |\xi_{1i_1} \cdots \xi_{ri_r}|\, f(e_{i_1}, e_{i_2}, \cdots, e_{i_r}).$
4 The determinant of a square array of vectors $x_{ij}$ ($i, j = 1, 2, \cdots, r$) may be defined as
$|x_{ij}| = \sum \operatorname{sgn}(i_1, i_2, \cdots, i_r)\, x_{1i_1} x_{2i_2} \cdots x_{ri_r}$.
In this definition the row marks are kept in normal order and the column marks permuted;
a different expression is obtained if the roles of the row and column marks are interchanged
but, as these determinants seem to have little intrinsic interest, it is not worth
while to develop a notation for the numerous variants of the definition given above.

Here the scalar multipliers are the same as the coordinates of $|x_1 x_2 \cdots x_r|$
and hence the definition of §1.11 may now be replaced by

$|x_1 x_2 \cdots x_r| = \sum \operatorname{sgn}(i_1, i_2, \cdots, i_r)\, x_{i_1} x_{i_2} \cdots x_{i_r}$

which justifies the notation used. We then have

(37) $|x_1 x_2 \cdots x_r| = \sum_{(i)} |\xi_{1i_1} \cdots \xi_{ri_r}|\, |e_{i_1} \cdots e_{i_r}|.$

It is easily seen that the tensors $|e_{i_1} \cdots e_{i_r}|$ are linearly independent, and
(37) therefore shows that they form a basis for the set of vectors of grade $r$.
Any expression of the form

$\sum \xi_{i_1 \cdots i_r}\, |e_{i_1} e_{i_2} \cdots e_{i_r}|$

is called a vector of grade $r$ and a vector of the form (37) is called a pure vector
of grade $r$.
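The passage from the tensor product to the alternating expression can be made concrete by working with coordinates. The following sketch stores a tensor of grade $r$ as a dictionary keyed by the index tuple $(i_1, \cdots, i_r)$, an illustrative representation that is not taken from the text, and checks that each coordinate of the alternating sum is the determinant formed from the corresponding coordinates of the vectors.

```python
from itertools import permutations, product

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def sign(p):
    """Sign of a permutation of 0..r-1, found by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def tensor(vectors):
    """Coordinates of the tensor product x_1 x_2 ... x_r, keyed by (i_1, ..., i_r)."""
    n = len(vectors[0])
    out = {}
    for idx in product(range(n), repeat=len(vectors)):
        c = 1
        for v, i in zip(vectors, idx):
            c *= v[i]
        out[idx] = c
    return out

def alternating(vectors):
    """Sum of sgn(p) times the tensor product of the vectors taken in the order p."""
    out = {}
    for p in permutations(range(len(vectors))):
        for idx, c in tensor([vectors[k] for k in p]).items():
            out[idx] = out.get(idx, 0) + sign(p) * c
    return out

xs = [[1, 2, 0, -1], [3, 1, 4, 2], [0, -2, 5, 1]]
alt = alternating(xs)
# Each coordinate should be the determinant of the chosen columns of the array of coordinates
matches = all(c == det([[v[i] for i in idx] for v in xs]) for idx, c in alt.items())
print(matches)
```

Coordinates with a repeated index vanish, and those with distinct indices differ only in sign from the one with the indices in natural order, in agreement with (37).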

5.10 The direct product. If $A_i = \|a^{(i)}_{pq}\|$ ($i = 1, 2, \cdots, r$) is a sequence
of matrices of order $n$, then

(38) $A_1x_1\, A_2x_2 \cdots A_rx_r = \sum a^{(1)}_{i_1j_1} a^{(2)}_{i_2j_2} \cdots \xi_{1j_1}\xi_{2j_2} \cdots e_{i_1}e_{i_2} \cdots e_{i_r} = \mathfrak{A}(x_1 x_2 \cdots x_r)$

where $\mathfrak{A}$ is a linear homogeneous tensor function of $x_1 x_2 \cdots x_r$, that is, a matrix
in space of $n^r$ dimensions. This matrix is called the direct product of
$A_1, A_2, \cdots, A_r$ and is denoted by $A_1 \times A_2 \times \cdots \times A_r$. Obviously

(39) $A_1B_1 \times A_2B_2 \times \cdots = (A_1 \times A_2 \times \cdots)(B_1 \times B_2 \times \cdots),$

and the form of (38) shows that

(40) $(A_1 \times A_2 \times \cdots)' = A_1' \times A_2' \times \cdots.$

From (39) we have, on putting $r = 3$ for convenience,

$A_1 \times A_2 \times A_3 = (A_1 \times 1 \times 1)(1 \times A_2 \times 1)(1 \times 1 \times A_3).$

Making $A_i = 1$ ($i = 2, 3, \cdots, r$) in (38) we have

$A_1x_1\, x_2 \cdots x_r = \sum a^{(1)}_{i_1j_1}\, \xi_{1j_1}\xi_{2i_2} \cdots \xi_{ri_r}\, e_{i_1}e_{i_2} \cdots e_{i_r}$

and hence the determinant of the corresponding matrix equals $|A_1|^{n^{r-1}}$.
Treating the other factors in the same way we then see that

(41) $|A_1 \times A_2 \times \cdots \times A_r| = |A_1A_2 \cdots A_r|^{n^{r-1}}.$

Again if as in §5.04 we take $x_1$ as an invariant vector of $A_1$, $x_2$ as an invariant
vector of $A_2$, and so on, and denote the roots of $A_i$ by $\lambda_{ij}$, we see that the roots
of $A_1 \times A_2 \times \cdots \times A_r$ are the various products $\lambda_{1j_1}\lambda_{2j_2} \cdots \lambda_{rj_r}$. When the roots
of each matrix are distinct, this gives equation (41) and, since this is an integral
relation among the coefficients of the $A$'s, it follows that it is true in general.

This definition may be generalized by taking $x_1, x_2, \cdots$ as vectors in different spaces
of possibly different orders. See also §7.03.
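Relations (39), (40) and (41) are easy to confirm on small matrices. The sketch below assumes the products $e_{i_1}e_{i_2}\cdots$ are ordered lexicographically, which gives the usual Kronecker block layout; any other definite order yields an equivalent matrix. The helper name `kron` and the test matrices are illustrative.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def matmul(a, b):
    """Ordinary matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Direct product a x b with index pairs in lexicographic order (assumed ordering)."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

A1, A2, A3 = [[1, 2], [3, 4]], [[0, 1], [-1, 2]], [[1, 1], [0, 2]]
B1, B2 = [[2, 0], [1, 1]], [[1, -1], [2, 3]]
n = 2

# (39): (A1 B1) x (A2 B2) = (A1 x A2)(B1 x B2)
rule39 = kron(matmul(A1, B1), matmul(A2, B2)) == matmul(kron(A1, A2), kron(B1, B2))
# (40): (A1 x A2)' = A1' x A2'
rule40 = transpose(kron(A1, A2)) == kron(transpose(A1), transpose(A2))
# (41) with r = 2 and r = 3 factors: exponent n^(r-1)
rule41_r2 = det(kron(A1, A2)) == (det(A1) * det(A2)) ** (n ** 1)
rule41_r3 = det(kron(kron(A1, A2), A3)) == (det(A1) * det(A2) * det(A3)) ** (n ** 2)
print(rule39, rule40, rule41_r2, rule41_r3)
```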

An important particular case arises when each of the matrices in (38) equals
the same matrix $A$; the resultant matrix is denoted by $\Pi_r(A)$, that is

(42) $\Pi_r(A) = A \times A \times \cdots$ ($r$ factors).

It is sometimes called the product transformation. Relations (39), (40), and
(41) then become

(43) $\Pi_r(AB) = \Pi_r(A)\,\Pi_r(B), \qquad \Pi_r(A)' = \Pi_r(A'), \qquad |\Pi_r(A)| = |A|^{rn^{r-1}}.$

5.11 Induced or power matrices. If $x_1, x_2, \cdots, x_r$ are arbitrary vectors,
the symmetric expression obtained by forming their products in every possible
order and adding is called a permanent. It is usually denoted by $| x_1 x_2 \cdots x_r |$
but it will be more convenient here to denote it by $\{x_1 x_2 \cdots x_r\}$; and similarly,
if $\|\alpha_{ij}\|$ is a square array of scalars, we shall denote by $\{\alpha_{11}\alpha_{22} \cdots \alpha_{rr}\}$ the function
$\sum \alpha_{1i_1}\alpha_{2i_2} \cdots \alpha_{ri_r}$ in which the summation stretches over every permutation
of $1, 2, \cdots, r$.

If some of the $x$'s are equal, the terms of $\{x_1 x_2 \cdots x_r\}$ become equal in sets
each of which has the same number of terms. If the $x$'s fall into $s$ groups of
$i_1, i_2, \cdots, i_s$ members, respectively, the members in each group being equal
to one another, then


has integral coefficients. For the present, we shall denote this expression by
*'* but sometimes it will be more convenient to use the more 

explicit notation 


in which $i_1$ of the $x$'s equal $x_1$, $i_2$ equal $x_2$, etc.; this notation is, in fact, that
already used in §2.08, for instance,

!* •'* y\ = 2 xhj + 2 xyx + 2 yx 2 

= x 2 y + xyx +- yx 2 = l { xxy j . 

The same convention applies immediately to $\{\alpha_{11}\alpha_{22} \cdots \alpha_{rr}\}$.

In the notation just explained we have 

(44) {j- 1 -r* ••• Xr) = 2" ••• e tr ) 

where the summation $\sum^{*}$ extends over all combinations $i_1 i_2 \cdots i_r$ of the first
$n$ integers, repetition being allowed. This shows that the set of all permanents
of grade $r$ has the basis $\{e_{i_1} e_{i_2} \cdots e_{i_r}\}$ of order $(n + r - 1)!/r!\,(n - 1)!$.

From (44) we readily derive 

(45) {AxiAxz ■■■ Ax r \ = ^ *{ ti/i M r t 1 ' '■ l 'r> 

i , j 

which is a linear tensor form in $\{x_1 x_2 \cdots x_r\}$. We may therefore set

(46) $\{Ax_1\, Ax_2 \cdots Ax_r\} = P_r(A)\, \{x_1 x_2 \cdots x_r\},$

where $P_r(A)$ is a matrix of order $(n + r - 1)!/r!\,(n - 1)!$ whose coordinates
are the polynomials in the coordinates of $A$ which are given in (45); this matrix
is called the $r$th induced or power matrix of $A$. As with $C_r(A)$ and $\Pi_r(A)$ it
follows that

(47) $P_r(AB) = P_r(A)\, P_r(B), \qquad P_r(A)' = P_r(A'), \qquad |P_r(A)| = |A|^{\binom{n+r-1}{r-1}};$

also the roots of $P_r(A)$ are the various products of the form $\lambda_1^{\alpha_1}\lambda_2^{\alpha_2} \cdots \lambda_n^{\alpha_n}$
for which $\sum \alpha_i = r$.
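The product rule and the determinant in (47) can be checked on a matrix equivalent to $P_r(A)$. The sketch below is an assumption-laden illustration: it uses the plain monomials $e_{i_1}e_{i_2}\cdots e_{i_r}$ in commuting units as basis rather than the permanents with the numerical factors of the text, so it produces a matrix differing from $P_r(A)$ by a constant diagonal transformation. The product rule and the determinant are unaffected by that change, but the transposition rule in (47) is not tested here. The names `poly_mul` and `induced` are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def matmul(a, b):
    """Ordinary matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def poly_mul(p, q):
    """Product of two forms in commuting units; keys are sorted index tuples (monomials)."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            key = tuple(sorted(a + b))
            out[key] = out.get(key, 0) + ca * cb
    return out

def induced(a, r):
    """Matrix of the map induced by A on forms of degree r, in the monomial basis (assumed basis)."""
    n = len(a)
    basis = list(combinations_with_replacement(range(n), r))
    columns = []
    for mono in basis:
        p = {(): 1}
        for j in mono:
            # replace e_j by A e_j = sum_i a[i][j] e_i
            p = poly_mul(p, {(i,): a[i][j] for i in range(n)})
        columns.append(p)
    return [[columns[c].get(row, 0) for c in range(len(basis))] for row in basis]

A, B = [[1, 2], [3, 4]], [[0, 1], [-1, 2]]
G = [[2, 1, 0], [1, -1, 1], [0, 2, 1]]

product_rule = all(induced(matmul(A, B), r) == matmul(induced(A, r), induced(B, r))
                   for r in (2, 3))
determinant_rule = all(det(induced(M, r)) == det(M) ** comb(len(M) + r - 1, r - 1)
                       for M, r in ((A, 2), (A, 3), (G, 2)))
print(product_rule, determinant_rule)
```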


5.12 Associated matrices. The matrices considered in the preceding sections
have certain common properties; the coordinates of each are functions of
the variable matrix $A$ and, if $T(A)$ stands for any one of them, then

(48) $T(AB) = T(A)\, T(B).$

Following Schur, who first treated the general problem of determining all such
matrices, we shall call any matrix with these properties an associated matrix.
If $S$ is any constant matrix in the same space as $T(A)$, then $T_1(A) = ST(A)S^{-1}$
is clearly also an associated matrix; associated matrices related in this manner
are said to be equivalent.

Let the orders of $A$ and $T(A)$ be $n$ and $m$ respectively and denote the corresponding
identity matrices by $1_n$ and $1_m$; then from (48)

(49) $T^2(1_n) = T(1_n), \qquad T(1_n)\, T(A) = T(A) = T(A)\, T(1_n).$


If $s$ is the rank of $T(1_n)$, we can find a matrix $S$ which transforms $T(1_n)$ into a
diagonal matrix with $s$ 1's in the main diagonal and zeros elsewhere; and we
may without real loss of generality assume that $T(1_n)$ has this form to start
with, and write

$T(1_n) = \begin{Vmatrix} 1_s & 0 \\ 0 & 0 \end{Vmatrix}.$

The second equation of (49) then shows that $T(A)$ has the form

$T(A) = \begin{Vmatrix} T_s(A) & 0 \\ 0 & 0 \end{Vmatrix}$

and we shall therefore assume that $s = m$ so that $T(1_n) = 1_m$. It follows
from this that $|T(A)| \not\equiv 0$ so that $T(A)$ is not singular for every $A$; we shall
then say that $T$ is non-singular.

A non-singular associated matrix $T(A)$ is reducible (cf. §3.10) if it can be
expressed in the form $T(A) = T_1(A) + T_2(A)$ where, if $E_1 = T_1(1_n)$, $E_2 =
T_2(1_n)$, so that $E_1 + E_2 = 1_m$, then

$T_1(A) = E_1T(A)E_1, \qquad T_2(A) = E_2T(A)E_2,$

$E_1T(A)E_2 = 0 = E_2T(A)E_1$

so that

$E_1^2 = E_1, \qquad E_2^2 = E_2, \qquad E_1E_2 = 0 = E_2E_1$

and there is therefore an equivalent associated matrix $t(A)$ which has the form

$t(A) = \begin{Vmatrix} t_1(A) & 0 \\ 0 & t_2(A) \end{Vmatrix}.$

When $T(A)$ is reducible in this manner we have

$T_1(AB) = E_1T(AB)E_1 = E_1T(A)T(B)E_1$
$= E_1T(A)(E_1 + E_2)T(B)E_1$
$= E_1T(A)E_1T(B)E_1 = T_1(A)T_1(B)$

so that $T_1(A)$ and $T_2(A)$ are separately associated matrices. We may therefore
assume $T(A)$ irreducible without loss of generality since reducible associated
matrices may be built up out of irreducible ones by reversing the process
used above.


5.13 We shall now show that, if $\lambda$ is a scalar variable, then $T(\lambda)$ is a power
of $\lambda$. To begin with we shall assume that the coordinates of $T(\lambda)$ are rational
functions in $\lambda$ and that $T(1)$ is finite; we can then set $T(\lambda) = T_1(\lambda)/f(\lambda)$ where
$f(\lambda)$ is a scalar polynomial whose leading coefficient is 1 and the coordinates
of $T_1(\lambda)$ are polynomials whose highest common factor has no factor in common
with $f(\lambda)$. If $\mu$ is a second scalar variable, (48) then gives

$\dfrac{T_1(\lambda)\, T_1(\mu)}{f(\lambda)\, f(\mu)} = \dfrac{T_1(\lambda\mu)}{f(\lambda\mu)};$

hence $f(\lambda\mu)$ is a factor of $f(\lambda)f(\mu)$, from which it follows readily that $f(\lambda\mu)
= f(\lambda)f(\mu)$; so that $f(\lambda)$ is a power of $\lambda$ and also

(50) $T_1(\lambda\mu) = T_1(\lambda)\, T_1(\mu).$

We also have $f(1) = 1$ and hence $T_1(1_n) = T(1_n) = 1_m$.

Let $T_1(\lambda) = F_0 + \lambda F_1 + \cdots + \lambda^s F_s$ ($F_s \neq 0$); then from (50)

$F_0 + \lambda\mu F_1 + \cdots + \lambda^s\mu^s F_s = (F_0 + \lambda F_1 + \cdots)(F_0 + \mu F_1 + \cdots)$

which gives

$F_i^2 = F_i, \qquad F_iF_j = 0 \quad (i \neq j), \qquad (i, j = 0, 1, \cdots, s).$

Now

$T_1(\lambda)\, T(A) = f(\lambda)\, T(\lambda)\, T(A) = f(\lambda)\, T(\lambda A) = T(A)\, T_1(\lambda);$

therefore

$\sum F_i\, T(A)\, \lambda^i = \sum T(A)\, F_i\, \lambda^i$

and hence on comparing powers of $\lambda$ we have

$F_i\, T(A) = T(A)\, F_i$

and, since $\sum F_i = T_1(1) = 1_m$ and we have assumed that $T(A)$ is irreducible,
it follows that every $F_i = 0$ except $F_s$, which therefore equals $1_m$. Hence
$T_1(\lambda) = \lambda^s$ and, since $f(\lambda)$ is a power of $\lambda$, we may set

(51) $T(\lambda) = \lambda^r.$

Since $T(\lambda A) = T(\lambda)T(A) = \lambda^r T(A)$, we have the following theorem.

Theorem 9. If $T(A)$ is irreducible, and if $T(\lambda)$ is a rational function of the
scalar variable $\lambda$, then $T(\lambda) = \lambda^r$ and the coordinates of $T(A)$ are homogeneous
functions of order $r$ in the coordinates of $A$.

The restriction that $T(\lambda)$ is rational in $\lambda$ is not wholly necessary. For
instance, if $q$ is any whole number and $\epsilon$ a corresponding primitive root of 1,
then $T^q(\epsilon) = 1_m$ and from this it follows without much difficulty that $T(\epsilon) = \epsilon^s$
where $s$ is an integer which may be taken to be the same for any finite number
of values of $q$. It follows then that, if $T(\lambda) = \|t_{ij}(\lambda)\|$, the functions $t_{ij}(\lambda)$
satisfy the equation

$t_{ij}(\epsilon\lambda) = \epsilon^s\, t_{ij}(\lambda)$

and under very wide assumptions as to the nature of the functions $t_{ij}$ it follows
from this that T(X) has the form X r . Again, if we assume that 7 T (X ) X ,r N 7 f f A r , 
then T(X)T(jjl) = T(Xjjl) gives immediately 

7V + " = T(n) 

so that only one value of r is admissible and for this value T r -- 1 as before. 

5.14 If the coordinates of $T(A)$ are rational functions of the coordinates $a_{ij}$
of $A$, so that $r$ is an integer, we can set $T(A) = T_1(A)/f(A)$ where the coordinates
of $T_1(A)$ are integral in the $a_{ij}$ and $f(A)$ is a scalar polynomial in these
variables which has no factor common to all the coordinates of $T_1(A)$. As
in (50) we then have

$T_1(AB) = T_1(A)\, T_1(B), \qquad f(AB) = f(A)\, f(B).$

It follows from the theory of scalar invariants that $f(A)$ can be taken as a
positive integral power of $|A|$; we shall therefore from this point on assume
that the coordinates of $T(A)$ are homogeneous polynomials in the coordinates
of $A$ unless the contrary is stated explicitly. We shall call $r$ the index of $T(A)$.

Theorem 10. If $T(A)$ is an associated matrix of order $m$ and index $r$, and if
the roots of $A$ are $\alpha_1, \alpha_2, \cdots, \alpha_n$, then the roots of $T(A)$ have the form
$\alpha_1^{r_1}\alpha_2^{r_2} \cdots \alpha_n^{r_n}$ where $\sum r_i = r$. The actual choice of the exponents $r_i$ depends on the particular
associated matrix in question but, if $\alpha_1^{r_1}\alpha_2^{r_2} \cdots \alpha_n^{r_n}$ is one root, all the distinct
quantities obtained from it by permuting the $\alpha$'s are also roots.

If the roots of $A$ are arbitrary variables, then $A$ is similar to a diagonal
matrix $A_1 = \sum \alpha_i e_{ii}$. We can express $T(A_1)$ as a polynomial⁶ in the $\alpha$'s, say

(52) $T(A_1) = \sum \alpha_1^{r_1}\alpha_2^{r_2} \cdots \alpha_n^{r_n}\, F_{r_1 r_2 \cdots r_n}$

where the $F$'s are constant matrices. If now $B = \sum \beta_i e_{ii}$ is a second variable
diagonal matrix, the relation $T(A_1B) = T(A_1)T(B)$ gives as in (50)

(53) $F_{r_1 r_2 \cdots r_n}^2 = F_{r_1 r_2 \cdots r_n}, \qquad F_{r_1 r_2 \cdots r_n} F_{s_1 s_2 \cdots s_n} = 0 \quad ((r_1, r_2, \cdots) \neq (s_1, s_2, \cdots))$

and hence $T(A_1)$ can be expressed as a diagonal matrix with roots of the required
form; these roots may of course be multiple since the rank of $F_{r_1 \cdots r_n}$ is not
necessarily 1; the elementary divisors are, however, simple.

Since the associated matrices of similar matrices are similar, it follows that
the roots of the characteristic equation of $T(A)$ are given by those terms in
(52) for which $F_{r_1 \cdots r_n} \neq 0$; and, since this equation has coefficients which
are polynomials in the coordinates of $A$, the roots of $T(A)$ remain in this form
even when the roots of $A$ are not necessarily all different.

The rest of the theorem follows from the fact that the trace of $T(A_1)$ equals
that of $T(A)$ which is rational in the coordinates of $A$ and is therefore symmetric
in the $\alpha$'s.

Theorem 11. The value of the determinant of $T(A)$ is $|A|^{rm/n}$ and $rm/n$ is
an integer.

For $T(A)\, T(\operatorname{adj} A) = T(|A|) = |A|^r$ and therefore $|T(A)|$ is a power of
$|A|$, say $|A|^s$. But $T(A)$ is a matrix of order $m$ whose coordinates are polynomials,
homogeneous of degree $r$, in the coordinates of $A$, so that $|T(A)|$ is of degree $mr$
while $|A|^s$ is of degree $sn$. Hence $sn = mr$ and $rm/n$ is an integer.
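As a consistency check on Theorem 11, the exponent $rm/n$ can be worked out for the three associated matrices of index $r$ met in the earlier sections; the orders $m$ are those stated in the text, and the resulting exponents agree with (17), (43) and (47). This is a worked restatement, not an additional result.

```latex
\begin{aligned}
C_r(A):\quad   & m=\binom{n}{r},     & \frac{rm}{n} &= \frac{r}{n}\binom{n}{r}=\binom{n-1}{r-1},\\
\Pi_r(A):\quad & m=n^{r},            & \frac{rm}{n} &= r\,n^{r-1},\\
P_r(A):\quad   & m=\binom{n+r-1}{r}, & \frac{rm}{n} &= \frac{r}{n}\binom{n+r-1}{r}=\binom{n+r-1}{r-1}.
\end{aligned}
```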

6 If we merely assume that $T(A_1)$ is a convergent series of the form (52), equation (53)
still holds. It follows that there are only a finite number of terms in (52) since (53) shows
that there is no linear relation among those $F_{r_1 \cdots r_n}$ which are not zero. Let $F_i$ be the
sum of those $F_{r_1 \cdots r_n}$ for which $\sum r_j$ has a fixed value $\rho_i$; then $T(\lambda) = \sum \lambda^{\rho_i} F_i$, and as before
only one value of $\rho_i$ is admissible when $T(A)$ is irreducible.

5.15 Transformable systems. From a scalar point of view each of the
associated matrices discussed in §§5.03-5.11 can be characterized by a set of
scalar functions $f_k$ ($k = 1, 2, \cdots, m$) of one or more sets of variables ($\xi_{ij}$,
$j = 1, 2, \cdots, n$), $i = 1, 2, \cdots, r$, which have the following property: if the
$\xi$'s are subjected to a linear transformation

$\xi'_{ij} = \sum_{s=1}^{n} a_{js}\, \xi_{is} \qquad (j = 1, 2, \cdots, n;\; i = 1, 2, \cdots, r)$

and if $f'_k$ is the result of replacing $\xi_{ij}$ by $\xi'_{ij}$ in $f_k$, then

$f'_k = \sum_{s=1}^{m} \alpha_{ks}\, f_s$

where the $\alpha$'s are functions of the $a_{ij}$ and are independent of the $\xi$'s. For
instance, corresponding to $C_2(A)$ we have

$f_k = \begin{vmatrix} \xi_{1p} & \xi_{1q} \\ \xi_{2p} & \xi_{2q} \end{vmatrix} \qquad (p, q = 1, 2, \cdots, n;\; p < q)$

for which

$\alpha_{ij} = \alpha_{pq,\,rs} = \begin{vmatrix} a_{pr} & a_{qr} \\ a_{ps} & a_{qs} \end{vmatrix}.$

We may, and will, always assume that there are no constant multipliers $\lambda_i$ such
that $\sum \lambda_i f_i = 0$. Such systems of functions were first considered by Sylvester;
they are now generally called transformable systems.

If we put $T(A) = \|\alpha_{ij}\|$, we have immediately $T(AB) = T(A)T(B)$, and
consequently there is an associated matrix corresponding to every transformable
system. Conversely, there is a transformable system corresponding to an
associated matrix. For if $X = \|\xi_{ij}\|$ is a variable matrix and $c$ an arbitrary
constant vector in the space of $T(A)$, then the coordinates of $T(X)c$ form a
transformable system since $T(A)T(X)c = T(AX)c$ and $c$ can be so determined
that there is no constant vector $b$ such that $SbT(X)c \equiv 0$.

The basis $f_k$ ($k = 1, 2, \cdots, m$) may of course be replaced by any basis
which is equivalent in the sense of linear dependence, the result of such a change
being to replace $T(A)$ by an equivalent associated matrix. If in particular
there exists a basis

$g_1, g_2, \cdots, g_{k_1}, \quad h_1, h_2, \cdots, h_{k_2} \qquad (k_1 + k_2 = m)$

such that the $g$'s and the $h$'s form separate transformable systems, then $T(A)$
is reducible; and conversely, if $T(A)$ is reducible, there always exists a basis
of this kind.


5.16 Transformable linear sets. If we adopt the tensor point of view
rather than the scalar one, an associated matrix is found to be connected with
a linear set $\mathfrak{F}$ of constant tensors, derived from the fundamental units $e_i$, such
that, when $e_i$ is replaced by $Ae_i$ ($i = 1, 2, \cdots, n$) in the members of the basis
of $\mathfrak{F}$, then the new tensors are linearly dependent on the old; in other words
the set $\mathfrak{F}$ is invariant as a whole under any linear transformation $A$ of the fundamental
units. For instance, in the case of $C_2(A)$ cited above, $\mathfrak{F}$ is the linear
set defined by

$|e_ie_j| \qquad (i, j = 1, 2, \cdots, n;\; i < j).$

We shall call a set which has this property a transformable linear set.

Let $u_1, u_2, \cdots, u_m$ be a transformable linear set of tensors of grade $r$ and let
$u'_i$ be the tensor that results when $e_j$ is replaced by $Ae_j$ ($j = 1, 2, \cdots, n$) in $u_i$.
Since the set is transformable, we have

$u'_i = \sum_j a_{ji}\, u_j = T(A)\, u_i \qquad (i = 1, 2, \cdots, m)$

where the $a_{ij}$ are homogeneous polynomials in the coordinates of $A$ of degree $r$.
If we employ a second transformation $B$, we then have

$u''_i = T(A)\, T(B)\, u_i, \qquad u''_i = T(AB)\, u_i \qquad (i = 1, 2, \cdots, m)$

and therefore $T(A)$ is an associated matrix.

We have now to show that there is a transformable linear set corresponding
to every associated matrix. In doing this it is convenient to extend the notation
$Suv$ to the case where $u$ and $v$ are tensors of grade $r$. Let $E_i$ ($i = 1, 2, \cdots, n^r$)
be the unit tensors of grade $r$ and


u — v — 

any tensors of grade r; we then define Suv by 



where the numerical divisor is introduced solely in order not to disagree with 
the definition of §5.02. 

Let $x_i = \sum_j \xi_{ij}e_j$ ($i = 1, 2, \ldots, r$) be a set of variable vectors and $X_i$ ($i = 1,
2, \ldots, s$) the set of tensors of the form $x_1^{j_1}x_2^{j_2} \cdots x_r^{j_r}$ ($\sum j_i = r$); we can then
put any product $\xi_{11}^{\beta_{11}}\xi_{12}^{\beta_{12}} \cdots$ for which $\sum \beta_{ij} = r$ in the form $k\,SE_iX_j$, $k$
being a numerical factor. This can be done in more than one way as a rule; in
fact, if $\sum_j \beta_{ij} = \beta_i$, then


> jc? n . . . u 

sll sin ~ ^Tj & 


T Pl 

■f] 


and from the definition of /Say it is clear that the factors in Ci 11 ■ • • e„' n can 
be permuted in any way without altering the value of the scalar. It follows that 


H 

Sll 


. tP'' 
si n 


Pin 


1 

0ii ! 0i2 ! * * * 0nJ 


S 




x 


Pi 

i 
and repeating this process we get




S r n & 




e-n 

filn 


£ [ rM’ ■ ■ ■ si' 


where $k$ is a numerical factor whose value is immaterial for our present purposes.

If $f$ is any homogeneous polynomial in the variables $\xi_{ij}$ of degree $p$, it can be
expressed uniquely in the form




fe i e-n 1 .. lei c„ 1 ___ <P 

\ftl • • • 0Lnf \/Al ‘ ’ ‘ Arnf 


where the inner summation extends over the partitions of $\beta_i$ into $\beta_{i1}, \beta_{i2}, \ldots, \beta_{in}$
($i = 1, 2, \ldots, r$) and the outer over all values of $\beta_1, \beta_2, \ldots, \beta_r$ for which
$\sum \beta_i = p$. We may therefore write

$$f = \sum_j SF_jX_j$$

where, as above, $X_j = x_1^{\beta_1} \cdots x_r^{\beta_r}$ and

J"f?i c n fc’i * v n ^ 

\/5ll * * * filnj \$rl * * ’ firnj 

The expression of $f$ in this form is unique. In the first place, $F_j \neq 0$ unless
each $\varphi_{\beta_{11} \ldots \beta_{rn}}$ is zero, since the set of tensors of the form


Fj = Ffrfa ... p r = 2^ u ... fim 


Cl 

fin 


c» 

Pm 


Cl 

Arl 


An 


(A/3.7 = p) 


are clearly linearly independent. Further, if $\sum_j SF_jX_j = 0$, then each $SF_jX_j$
is zero since each gives rise to terms of different type in the $\xi_{ij}$; and finally the
form of $F_j$ shows that $SF_jX_j = 0$ only if $F_j = 0$ since, in


. p-flll 

-^11 •• • flrnSll 


£ rn 

s nt 


SFjXj = k& 

each term of the summation is of different type in the $\xi_{ij}$.

Let $(f_i)$ be a transformable system; we can now write uniquely

$$f_i = \sum_j SF_{ij}X_j \qquad (i = 1, 2, \ldots, m) \tag{54}$$

and we may set

$$F = \sum_{i=1}^{n^r} E_if_i = \sum_{i,j} E_i\,SF_{ij}X_j$$

where $f_i = 0$ when $i > m$. If we transform the $x$'s by $A = \|\alpha_{ij}\|$ and denote
$\Pi_r(A)$ temporarily by $\Pi$, then $X_j$ becomes $\Pi X_j$ and $F$ is transformed into $F^*$
where

$$F^* = \sum_{i,j} E_i\,SF_{ij}\Pi X_j = \sum_{i,j} E_i\,S\Pi'F_{ij} \cdot X_j. \tag{55}$$
But the $f$'s form a transformable system and hence by this transformation $f_i$
becomes

$$f^*_i = \sum_k a_{ik}f_k$$

so that

$$F^* = \sum_{i,k} a_{ik}f_kE_i = \sum_j \sum_i E_i \sum_k a_{ik}\,SF_{kj}X_j. \tag{56}$$

Comparing (55) and (56) we have

$$\sum_j S\Bigl(\sum_k a_{ik}F_{kj} - \Pi'F_{ij}\Bigr)X_j = 0 \tag{57}$$

and therefore, as was proved above, each of the terms of the summation is zero,
that is,

$$\Pi'F_{ij} = \sum_k a_{ik}F_{kj} \tag{58}$$

and therefore, if $j$ is kept fixed, the linear set

$$(F_{1j}, F_{2j}, \ldots) \tag{59}$$

is transformable provided $F_{1j}, F_{2j}, \ldots$ are linearly independent.

If there is no $j$ for which the set (59) is linearly independent we proceed as
follows. Let $f_{ij} = SF_{ij}X_j$ so that

$$\begin{aligned}
f_1 &= f_{11} + f_{12} + \cdots + f_{1s} \\
f_2 &= f_{21} + f_{22} + \cdots + f_{2s} \\
&\;\;\vdots \\
f_m &= f_{m1} + f_{m2} + \cdots + f_{ms}.
\end{aligned} \tag{60}$$

If the removal of any column of this array leaves the new $f_i$ so defined linearly
independent, they form a transformable system which defines the same associated
matrix as the original system; we shall therefore suppose that the removal
of any column leads to linear relations among the rows, the coefficients of these
relations being constants. Remove now the first column; then by non-singular
constant combinations of the rows we can make certain of them, say the first
$m_1$, equal 0, the remainder being linearly independent. On applying the same
transformation to the rows of (60), which leaves it still a transformable system,
we see that we may replace (60) by an array of the form

$$\begin{aligned}
f_1 &= f_{11} \\
&\;\;\vdots \\
f_{m_1} &= f_{m_1 1} \\
f_{m_1+1} &= f_{m_1+1,1} + f_{m_1+1,2} + \cdots + f_{m_1+1,s} \\
&\;\;\vdots \\
f_m &= f_{m1} + f_{m2} + \cdots + f_{ms}
\end{aligned} \tag{61}$$

where $f_{m_1+i} - f_{m_1+i,1}$ ($i = 1, 2, \ldots, m - m_1$) are linearly independent. It
follows that $f_1, f_2, \ldots, f_{m_1}$ are transformed among themselves and so form a
transformable system. For these functions are transformed in the same way
as $f_{11}, f_{21}, \ldots, f_{m_1 1}$, and if the last $m - m_1$ rows of (61) were involved in the
transformation, this would mean that $f_{11}, \ldots, f_{m_1 1}$, when transformed, would
depend on $f_{m_1+1,i}$, etc., which is impossible owing to the linear independence
of $f_{m_1+i} - f_{m_1+i,1}$ ($i = 1, 2, \ldots, m - m_1$).

Corresponding to the first column of (61) we have tensors $F_{11}, F_{21}, \ldots, F_{m1}$
and we may suppose this basis so chosen that $F_{i1}$ ($i = 1, 2, \ldots, p$) are linearly
independent and $F_{j1} = 0$ for $j > p$; and this can be done without disturbing
the general form of (61). If $p = m$, we have a transformable system of the
type we wish to obtain and we shall therefore assume that $p < m$. We may
also suppose the basis so chosen that $S\bar{F}_{i1}F_{j1} = \delta_{ij}$ ($i, j = 1, 2, \ldots, p$) as in
Lemma 2, §1.09. It follows from what we have proved above that $F_{11}, F_{21},
\ldots, F_{m_1 1}$ is a transformable set.

Let $A$ be a real matrix, the corresponding transformation of the $F$'s being,
as in (58),

$$F^*_{i1} = \sum_j a_{ij}F_{j1} = \Pi'F_{i1} \qquad (i = 1, 2, \ldots, p); \tag{62}$$

we then have

$$\bar{F}^*_{i1} = \sum_j \bar{a}_{ij}\bar{F}_{j1} = \Pi'(A)\bar{F}_{i1} \tag{63}$$

so that the Fa also forms a transformable set. Since Fu, **•, form a 
transformable set, an and an are 0 when i > m x and j < m\ no mailer wlm ( 
matrix A is. Now 

ctn = SF h Fa= SFj 1 U'(A)F. il = Sn(A)hn = fillWPj/i, 

which equals 0 for i < m h j > in x since by (63) n , (-4 , )/ y n is deriv(‘d from F J{ 
by the transformation A' on the x’s and for j < m Y is therefore linearly depend- 
ent on F ix (j = 1, 2, • • •, mi). Hence the last m — m\ rows in (61) also form 
a transformable system, which is only possible if the system $f_1, f_2, \ldots, f_m$ is
reducible. If $T(A)$ is irreducible, the corresponding transformable system is
irreducible and it follows now that there also corresponds to it an irreducible
transformable set of tensors.

5.17 We have now shown that to every associated matrix $T(A)$ of index $r$ and
order $m$ there corresponds a transformable linear set of constant tensors
$F_1, F_2, \ldots, F_m$ of grade $r$ whose law of transformation is given by (62). Also
since $\Pi'(A) = \Pi(A')$, we have

$$\Pi F_i = \sum_k a'_{ik}F_k, \qquad \Pi\bar{F}_i = \sum_k \bar{a}'_{ik}\bar{F}_k \tag{64}$$

where $T(A') = \|a'_{ij}\|$.

Since $F_1, F_2, \ldots, F_m$ are linearly independent, we can find a supplement to
this set in the set of all tensors of grade $r$, say

$$G_1, G_2, \ldots, G_\mu \qquad (\mu = n^r - m)$$

such that

$$S\bar{F}_iG_j = 0. \tag{65}$$

It is convenient also to choose bases for both sets such that

$$S\bar{F}_iF_j = \delta_{ij} = S\bar{G}_iG_j. \tag{65$'$}$$

Since the two sets together form a basis for the space of $\Pi$, we can set

$$\Pi'G_j = \sum_i \beta_{ij}F_i + \sum_i \gamma_{ij}G_i$$

and this gives

$$\beta_{ij} = S\bar{F}_i\Pi'G_j = SG_j\Pi\bar{F}_i$$

which is 0 from (64) and (65), hence the $G$'s are transformed among themselves
by $\Pi'$. This means, however, that $\Pi'$ is reducible, and when it is expressed
in terms of the basis $(F_1, \ldots, F_m, G_1, \ldots, G_\mu)$, the part corresponding to
$(F_1, \ldots, F_m)$ has the form $\|a_{ij}\|$ and is therefore similar to $T(A)$. Hence:

Theorem 12. Every irreducible associated matrix $T(A)$ of index $r$ is equivalent
to an irreducible part of $\Pi_r(A)$, and conversely.

5.18 Irreducible transformable sets. If $F$ is a member of a transformable
linear set $\mathfrak{F} = (F_1, F_2, \ldots, F_m)$, the total set of tensors derived from $F$ by all
linear transformations of the fundamental units clearly form a transformable
linear set which is contained in $\mathfrak{F}$, say $\mathfrak{F}_1$; and we may suppose the basis of $\mathfrak{F}$
so chosen that $\mathfrak{F}_1 = (F_1, F_2, \ldots, F_k)$ and $S\bar{F}_iF_j = \delta_{ij}$ ($i, j = 1, 2, \ldots, m$).
Let $G$ be an element of $(F_{k+1}, \ldots, F_m)$ and $G'$ a transform of $G$ so that

$$G' = \sum_i \gamma_iF_i.$$

Then $S\bar{F}_iG' = \gamma_i$. But $S\bar{F}_iG' = S\bar{F}'_iG$, where $\bar{F}'_i$ is the transform of $\bar{F}_i$ obtained
by the transverse of the transformation which produced $G'$ from $G$ so that $F'_i$
is in $\mathfrak{F}_1$ for $i \leq k$. Hence $\gamma_i = 0$ for $i = 1, 2, \ldots, k$, that is, $(F_{k+1}, \ldots, F_m)$
is also a transformable set; and so, when the original set is irreducible, we must
have $\mathfrak{F}_1 = \mathfrak{F}$. If we say that $F$ generates $\mathfrak{F}$, this result may be stated as follows.

Lemma 5. An irreducible transformable linear set is generated by any one of
its members.


We may choose $F$ so that it is homogeneous in each $e_i$; for if we replace, say,
$e_1$ by $\lambda e_1$, then $F$ has the form $\sum \lambda^kH_k$ and by the same argument as in §5.13,
any $H_k$ which is not 0 is homogeneous in $e_1$ and belongs to $\mathfrak{F}$. A repetition of
this argument shows that we may choose $F$ to be homogeneous in each of the
fundamental units which occur in it. If $r$ is the grade of $F$, we may assume
that $F$ depends on $e_1, e_2, \ldots, e_s$, and, if $k_1, k_2, \ldots, k_s$ are the corresponding
degrees of homogeneity, then $\sum k_i = r$ and, when convenient, we may arrange
the notation so that $k_1 \geq k_2 \geq \cdots \geq k_s$.

If we now replace $e_1$ in $F$ by $e_1 + \lambda e_i$ ($i > s$), the coefficient $H$ of $\lambda$ is not 0,
since $i > s$, and $H$ becomes $k_1F$ when $e_i$ is replaced by $e_1$; it therefore forms a
generator of $\mathfrak{F}$ in which the degree of $e_1$ is one less than before. It follows that,
when $r \leq n$, we may choose a generator which is linear and homogeneous in $r$
units $e_1, e_2, \ldots, e_r$. It is also readily shown that such a tensor defines an
irreducible transformable linear set if, and only if, it forms an irreducible set
when the transformations of the units are restricted to permuting the first
$r$ $e$'s among themselves. Further, since the choice of fundamental units is
arbitrary, we may replace them by variable vectors $x_1, x_2, \ldots, x_r$. For
instance, the transformable sets associated with $\Pi_r$, $P_r$ and $C_r$ are $x_1x_2 \cdots x_r$,
$\{x_1x_2 \cdots x_r\}$ and $|x_1x_2 \cdots x_r|$, respectively, and of these the first is reducible
and the other two irreducible.

5.19 It is not difficult to calculate directly the irreducible transformable sets
for small values of $r$ by the aid of the results of the preceding paragraph. If
we denote $x_1, x_2, \ldots$ by $1, 2, \ldots$, the following are generators for $r = 2, 3$.

                 generator     order

    r = 2
        2.1      {12}          n(n + 1)/2
        2.2      |12|          n(n - 1)/2

    r = 3
        3.1      {123}         n(n + 1)(n + 2)/6
        3.2      {1|23|}       n(n^2 - 1)/3
        3.3      |1{23}|       n(n^2 - 1)/3
        3.4      |123|         n(n - 1)(n - 2)/6
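
A quick arithmetic check on the table can be sketched as follows. It assumes that the sets listed for a given $r$ together account for all tensors of grade $r$, so that their orders should add up to $n^r$; the function names `orders_r2` and `orders_r3` are illustrative only and do not come from the text.

```python
def orders_r2(n):
    """Orders of the sets generated by {12} and |12| (entries 2.1, 2.2)."""
    return [n * (n + 1) // 2, n * (n - 1) // 2]


def orders_r3(n):
    """Orders of the sets for entries 3.1, 3.2, 3.3, 3.4 of the table."""
    return [
        n * (n + 1) * (n + 2) // 6,
        n * (n * n - 1) // 3,
        n * (n * n - 1) // 3,
        n * (n - 1) * (n - 2) // 6,
    ]


# The orders for grade r should together account for all n**r unit tensors.
for n in range(1, 8):
    assert sum(orders_r2(n)) == n ** 2
    assert sum(orders_r3(n)) == n ** 3
```

The integer divisions are exact because each numerator is a product of consecutive integers.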


This method of determining the generators directly is tedious and the following
method is preferable.$^7$ Any generator has the form

$$w_1 = \sum \omega_{i_1i_2 \ldots i_r}\,x_{i_1}x_{i_2} \cdots x_{i_r}$$

and if $q_{i_1 \ldots i_r}$ denotes the substitution $\begin{pmatrix} 1 & 2 & \cdots & r \\ i_1 & i_2 & \cdots & i_r \end{pmatrix}$ we may write

$$w_1 = \sum \omega_{i_1 \ldots i_r}\,q_{i_1 \ldots i_r}(x_1x_2 \cdots x_r) = q_1(x_1x_2 \cdots x_r)$$

where $q_1$ may be regarded (see chap. 10) as an element of the algebra $S$ whose
units are the operators $q$ of the symmetric group on $r$ letters. Now $w_1$ generates
a transformable set and hence, if $w_i = q_i(x_1 \cdots x_r)$ ($i = 1, 2, \ldots$) is a

$^7$ Fuller details of the actual determination of the generators will be found in Weyl:
Gruppentheorie und Quantentheorie, 2 ed. chap. 5.
basis of the set, and $Q$ is the set of elements $q_1, q_2, \ldots$ in $S$, then the set of
elements $Qq = (q_1q, q_2q, \ldots)$ must be the same as the set $Q$, that is, in the
terminology of chapter 10, $Q$ is a semi-invariant subalgebra of $S$; conversely
any such semi-invariant subalgebra gives rise to a transformable set and this
set is irreducible if the semi-invariant subalgebra is minimal, that is, is contained
in no other such subalgebra.

It follows now from the form derived for a group algebra such as $S$ that we
get all independent generators as follows. In the first place the operators of $S$
can be divided into sets$^8$ $S_k$ ($k = 1, 2, \ldots, t$) such that (i) the product of an
element of $S_k$ into an element of $S_j$ ($k \neq j$) is zero; (ii) in the field of complex
numbers a basis for each $S_k$ can be chosen which gives the algebra of matrices
of order $\nu_k^2$; and in an arbitrary field $S_k$ is the direct product of a matric algebra
and a division algebra; (iii) there exists a set of elements $u_{k1}, u_{k2}, \ldots, u_{k\nu_k}$
in $S_k$ such that $\sum_i u_{ki}$ is the identity of $S_k$ and $u_{ki}^2 = u_{ki} \neq 0$, $u_{ki}u_{kj} = 0$ ($i \neq j$)
and such that the set of elements $u_{ki}S_ku_{ki}$ is a division algebra, which in the
case of the complex field contains only one independent element; (iv) the
elements of $S_k$ can be divided into $\nu_k$ sets $u_{ki}S_k$ ($i = 1, 2, \ldots$) each of which
is a minimal semi-invariant subalgebra of $S$ and therefore corresponds to an
irreducible transformable set.


$^8$ It is shown in the theory of groups that $t$ equals the number of partitions of $r$.



CHAPTER VI 
6.01 Hermitian matrices. If we denote by $\bar{x}$ the matrix which is derived
from $x$ by replacing each coordinate by its conjugate imaginary, then $x$ is
called a hermitian matrix if

$$\bar{x} = x'. \tag{1}$$

We may always set $x = x_1 + ix_2$ where $x_1$ and $x_2$ are real and (1) shows that,
when $x$ is hermitian,

$$x'_1 = x_1, \qquad x'_2 = -x_2, \tag{2}$$

so that the theory of real symmetric and real skew matrices is contained in that
of the hermitian matrix. The following are a few properties which follow
immediately from the definition; their proof is left to the reader.

If $x$ and $y$ are hermitian and $a$ is arbitrary, then

$$x + y, \quad \bar{x}, \quad ax\bar{a}', \quad xy + yx, \quad i(xy - yx)$$

are all hermitian.

Any matrix $x$ can be expressed uniquely in the form $a + ib$ where $2a =
x + \bar{x}'$, $2b = -i(x - \bar{x}')$ are hermitian.

The product of two commutative hermitian matrices is hermitian. In
particular, any integral power of a hermitian matrix $x$ is hermitian; and, if
$g(\lambda)$ is a scalar polynomial with real coefficients, $g(x)$ is hermitian.

Theorem 1. If $a, b, c, \ldots$ are hermitian matrices such that $a^2 + b^2 + c^2 +
\cdots = 0$, then $a, b, c, \ldots$ are all 0.

If $\sum a^2 = 0$, its trace is 0; but $\sum a^2 = \sum a\bar{a}'$ and the trace of the latter is the
sum of the squares of the absolute values of the coordinates of $a, b, \ldots$; hence
each of these coordinates is 0.
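
The trace identity used in this proof can be checked numerically. The following is only a sketch on one sample $2 \times 2$ hermitian matrix chosen for illustration; the helper names (`conj_transpose`, `matmul`, `trace`, `abs_squares`) are assumptions of the example, not notation from the text.

```python
def conj_transpose(a):
    """Return the conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Ordinary matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal coordinates."""
    return sum(a[i][i] for i in range(len(a)))


def abs_squares(a):
    """Sum of the squares of the absolute values of the coordinates."""
    return sum(complex(x).real ** 2 + complex(x).imag ** 2 for row in a for x in row)


# A sample hermitian matrix (illustrative values only).
a = [[2, 1 - 3j], [1 + 3j, -1]]

# Hermitian: a equals its conjugate transpose, so a*a is a times conj-transpose(a).
assert conj_transpose(a) == a
assert trace(matmul(a, a)) == abs_squares(a)
```

Since the right-hand side is a sum of non-negative terms, it can vanish only when every coordinate is 0, which is the step the proof relies on.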

Theorem 2. The roots of a hermitian matrix are real and its elementary
divisors are simple.

Let $x$ be a hermitian matrix and $g(\lambda)$ its reduced characteristic function.
Since $g(x) = 0$, we have $0 = \bar{g}(\bar{x}) = \bar{g}(x')$ and, since $x$ and $x'$ have the same
reduced characteristic function, it follows that $g(\lambda) \equiv \bar{g}(\lambda)$, that is, the coefficients
of $g$ are real. Suppose that $\xi_1 = \alpha + i\beta$ ($\beta \neq 0$) is a root of $g(\lambda)$; then
$\bar{\xi}_1 = \alpha - i\beta \neq \xi_1$ is also a root, and we may set

$$g(\lambda) = (\lambda - \xi_1)\bigl(g_1(\lambda) + ig_2(\lambda)\bigr) = (\lambda - \bar{\xi}_1)\bigl(g_1(\lambda) - ig_2(\lambda)\bigr) \tag{3}$$

where $g_1$, $g_2$ are real polynomials of lower degree than $g$, neither of which is
identically 0 since $g$ is real and $\xi_1$ complex. Now

$$[g_1(x)]^2 + [g_2(x)]^2 = [g_1(x) + ig_2(x)][g_1(x) - ig_2(x)]$$

and this product is 0 since from (3) $\lambda - \xi_1$ is a factor of $g_1(\lambda) - ig_2(\lambda)$ and
$(\lambda - \xi_1)(g_1(\lambda) + ig_2(\lambda)) = g(\lambda)$. But, since the coefficients of $g_1$ and $g_2$ are
real, the matrices $g_1(x)$, $g_2(x)$ are hermitian and, seeing that the sum of their
squares is 0, they both vanish by Theorem 1. This is however impossible
since $g_1(\lambda)$ is of lower degree than the reduced characteristic function of $x$.
Hence $x$ cannot have a complex root.

To prove that the elementary divisors are simple it is only necessary to show
that $g(\lambda)$ has no multiple root. Let

$$g(\lambda) = (\lambda - \xi)^rh(\lambda), \qquad h(\xi) \neq 0.$$

If $r > 1$, set $g_1(\lambda) = (\lambda - \xi)^{r-1}h(\lambda)$; then $[g_1(\lambda)]^2$ has $g(\lambda)$ as a factor so that
the square of the hermitian matrix $g_1(x)$ is 0. Hence by Theorem 1, $g_1(x)$ is
itself 0, which is impossible since the degree of $g_1$ is less than that of $g$. It
follows that $r$ cannot be greater than 1, which completes the proof of the
theorem.

Since the elementary divisors are simple, the canonical form of $x$ is a diagonal
matrix. Suppose that $n - r$ roots are 0 and that the remaining roots are
$\xi_1, \xi_2, \ldots, \xi_r$; these are of course not necessarily all different. The canonical
form is then

$$\begin{Vmatrix}
\xi_1 & & & & & \\
& \ddots & & & & \\
& & \xi_r & & & \\
& & & 0 & & \\
& & & & \ddots & \\
& & & & & 0
\end{Vmatrix}.$$

The following theorem is contained in the above results.

Theorem 3. A hermitian matrix of rank $r$ has exactly $n - r$ zero roots.

It also follows immediately that the characteristic equation of $x$ has the form

$$x^n - a_1x^{n-1} + \cdots + (-1)^ra_rx^{n-r} = 0 \qquad (a_r \neq 0)$$

where $a_i$ is the elementary symmetric function of the $\xi$'s of degree $i$. Since $a_r$
is the sum of the principal minors of $x$ of order $r$, we have

Theorem 4. In a hermitian matrix of rank $r$ at least one principal minor of
order $r$ is not 0.

In view of the opening paragraph of this section Theorems 1-4 apply also
to real symmetric matrices; they apply also to real skew matrices except that
Theorem 2 must be modified to state that the roots are pure imaginaries.

6.02 The invariant vectors of a hermitian matrix. Let $H$ be a hermitian
matrix, $\alpha_1$, $\alpha_2$ two different roots, and $a_1$, $a_2$ the corresponding invariant vectors
so chosen that $S\bar{a}_ia_i = 1$; then, since $Ha_1 = \alpha_1a_1$, $\bar{H}\bar{a}_2 = \alpha_2\bar{a}_2$, we have

$$\alpha_1S\bar{a}_2a_1 = S\bar{a}_2Ha_1 = SH'\bar{a}_2a_1 = \alpha_2S\bar{a}_2a_1$$

and, since $\alpha_1 \neq \alpha_2$, we must have $S\bar{a}_2a_1 = 0$. Again, if $\alpha$ is a repeated root of
order $s$ and $a_1, a_2, \ldots, a_s$ a corresponding set of invariant vectors we may
choose these vectors (cf. §1.09) so that $S\bar{a}_ia_j = \delta_{ij}$. The invariant vectors
may therefore be so chosen that they form a unitary set and

$$H = \sum \alpha_ia_iS\bar{a}_i. \tag{4}$$

If $U$ is the matrix defined by

$$Ue_i = a_i \qquad (i = 1, 2, \ldots, n), \tag{5}$$

then

$$U\bar{U}' = 1, \tag{6}$$

so that $U$ is unitary, and if $\Lambda$ is the diagonal matrix $\sum_i \alpha_ie_iSe_i$, then

$$H = U\Lambda\bar{U}' = U\Lambda U^{-1}. \tag{7}$$

We may therefore say:

Theorem 5. A hermitian matrix can be transformed to its canonical form by
a unitary matrix.
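
The expansion (4) can be verified on a small case. The sketch below uses one sample $2 \times 2$ hermitian matrix whose roots (1 and 4) and invariant vectors were worked out by hand for this illustration; the names `s_bar`, `apply` and `rebuild` are assumptions of the example.

```python
import math

# A sample 2x2 hermitian matrix (illustrative values only).
H = [[2, 1 - 1j], [1 + 1j, 3]]

# Its roots and a unitary set of invariant vectors, found by hand.
roots = [1.0, 4.0]
vectors = [
    [(1 - 1j) / math.sqrt(3), -1 / math.sqrt(3)],
    [(1 - 1j) / math.sqrt(6), 2 / math.sqrt(6)],
]


def s_bar(u, v):
    """The scalar S(u-bar)v, that is, the sum of conj(u_k) * v_k."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def apply(m, v):
    """Matrix times vector."""
    return [sum(m[p][q] * v[q] for q in range(len(v))) for p in range(len(m))]


def rebuild(roots, vectors):
    """Form the sum of alpha_i a_i S(a_i-bar), as in equation (4)."""
    n = len(vectors[0])
    return [[sum(alpha * a[p] * complex(a[q]).conjugate()
                 for alpha, a in zip(roots, vectors))
             for q in range(n)] for p in range(n)]


H_rebuilt = rebuild(roots, vectors)

# The vectors form a unitary set and the sum reproduces H up to rounding.
assert abs(s_bar(vectors[0], vectors[1])) < 1e-12
assert all(abs(H_rebuilt[p][q] - H[p][q]) < 1e-12
           for p in range(2) for q in range(2))
```

The columns of the matrix $U$ of (5) are exactly the entries of `vectors`, so the same data also illustrates (7).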

If $H$ is a real symmetric matrix, the roots and invariant vectors are real,
and hence $U$ is a real orthogonal matrix. Hence

Theorem 6. A real symmetric matrix can be transformed to its canonical
form by a real orthogonal matrix.

If $T$ is a real skew matrix, $h = iT$ is hermitian. The non-zero roots of $T$
are therefore pure imaginaries and occur in pairs of opposite sign. The invariant
vectors corresponding to the zero roots are real and hence by the proof
just given they may be taken orthogonal to each other and to each of the
other invariant vectors. Hence, if the rank of $T$ is $r$, we can find a real orthogonal
matrix which transforms it into a form in which the last $n - r$ rows and
columns are zero.

Let $i\alpha$ be a root of $T$ which is not 0 and $a = b + ic$ a corresponding invariant
vector; then $Ta = i\alpha a$ so that

$$Tb = -\alpha c, \qquad Tc = \alpha b.$$

Hence

$$-\alpha Sc^2 = ScTb = -SbTc = -\alpha Sb^2, \qquad -\alpha Sbc = SbTb = 0,$$

which gives

$$Sb^2 = Sc^2, \qquad Sbc = 0.$$

We can then choose $a$ so that $Sb^2 = Sc^2 = 1$ and can therefore find a real
orthogonal matrix which transforms $T$ into

$$\begin{Vmatrix}
0 & \alpha_1 & & & \\
-\alpha_1 & 0 & & & \\
& & 0 & \alpha_2 & \\
& & -\alpha_2 & 0 & \\
& & & & \ddots
\end{Vmatrix} = \sum_i \alpha_i(e_{2i-1}Se_{2i} - e_{2i}Se_{2i-1}). \tag{8}$$

We have therefore the following theorem.

Theorem 7. If $T$ is a real skew matrix, its non-zero roots are pure imaginaries
and occur in pairs of opposite sign; its rank is even; and it can be transformed
into the form (8) by a real orthogonal matrix.

6.03 Unitary and orthogonal matrices. The following properties of a unitary
matrix follow immediately from its definition by equation (6).

The product of two unitary matrices is unitary.

The transform of a hermitian matrix by a unitary matrix is hermitian.

The transform of a unitary matrix by a unitary matrix is unitary.

If $H_1$ and $H_2$ are hermitian, a short calculation shows that

$$U_1 = \frac{1 - iH_1}{1 + iH_1}, \qquad U_2 = \frac{iH_2 - 1}{iH_2 + 1} \tag{9}$$

are unitary (the inverses used exist since a hermitian matrix has only real
roots). Solving (9) for $H_1$ and $H_2$ on the assumption that the requisite inverses
exist we get

$$H_1 = \frac{i(U_1 - 1)}{U_1 + 1}, \qquad H_2 = \frac{i(U_2 + 1)}{U_2 - 1}.$$

These are hermitian when $U_1$ and $U_2$ are unitary, and therefore any unitary
matrix which has no root equal to $-1$ can be put in the first form while the
second can be used when $U$ has no root equal to 1.
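
The first form in (9) and its inverse can be tried out on a sample matrix. This is a minimal sketch for the $2 \times 2$ case only, using floating-point arithmetic and an illustrative hermitian matrix `H1`; the helper names (`matmul`, `inverse`, `combine`, `conj_transpose`, `close`) are assumptions of the example.

```python
def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Inverse of a non-singular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def combine(a, b, c=1):
    """Return a + c*b entry by entry."""
    return [[a[i][j] + c * b[i][j] for j in range(2)] for i in range(2)]


def conj_transpose(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
H1 = [[1, 2 - 1j], [2 + 1j, -3]]        # hermitian, illustrative values
iH1 = combine(ZERO, H1, 1j)

# U1 = (1 - iH1)/(1 + iH1), then H1 recovered as i(U1 - 1)/(U1 + 1).
U1 = matmul(combine(ONE, iH1, -1), inverse(combine(ONE, iH1)))
H1_back = combine(ZERO, matmul(combine(U1, ONE, -1), inverse(combine(U1, ONE))), 1j)

assert close(matmul(U1, conj_transpose(U1)), ONE)   # U1 is unitary
assert close(H1_back, H1)                           # the inverse formula recovers H1
```

The numerator and denominator commute, so the order of the factors in the quotient is immaterial here.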


Theorem 8. The absolute value of each root of a unitary matrix equals 1.

Let $\alpha + i\beta$ be a root and $a + ib$ a corresponding invariant vector; then

$$U(a + ib) = (\alpha + i\beta)(a + ib), \qquad \bar{U}(a - ib) = (\alpha - i\beta)(a - ib).$$

Hence

$$\begin{aligned}
Sa^2 + Sb^2 &= S(a + ib)(a - ib) = S(a + ib)U'\bar{U}(a - ib) = SU(a + ib)\bar{U}(a - ib) \\
&= (\alpha^2 + \beta^2)S(a + ib)(a - ib) = (\alpha^2 + \beta^2)(Sa^2 + Sb^2),
\end{aligned}$$

so that $\alpha^2 + \beta^2 = 1$.

Corollary.

$$\begin{aligned}
U^{-1}(a + ib) &= (\alpha - i\beta)(a + ib), \\
U'(a - ib) = \bar{U}^{-1}(a - ib) &= (\alpha + i\beta)(a - ib).
\end{aligned} \tag{10}$$

Theorem 9. The elementary divisors of a unitary matrix are simple.

For, if we have

$$U(a_1 + ib_1) = (\alpha + i\beta)(a_1 + ib_1), \qquad U(a_2 + ib_2) = (\alpha + i\beta)(a_2 + ib_2) + (a_1 + ib_1),$$

then from (10)

$$\begin{aligned}
(\alpha + i\beta)S(a_1 - ib_1)(a_2 + ib_2) &= SU'(a_1 - ib_1)(a_2 + ib_2) = S(a_1 - ib_1)U(a_2 + ib_2) \\
&= (\alpha + i\beta)S(a_1 - ib_1)(a_2 + ib_2) + S(a_1 - ib_1)(a_1 + ib_1)
\end{aligned}$$

which is impossible since $S(a_1 - ib_1)(a_1 + ib_1) = Sa_1^2 + Sb_1^2 \neq 0$.

The results of this section apply immediately to real$^1$ orthogonal matrices;
it is however convenient to repeat (9).

Theorem 10. If $U$ is a real orthogonal matrix, it can be expressed in the form
$(1 + T)/(1 - T)$ if it has no root equal to $-1$ and in the form $(T - 1)/(T + 1)$
if it has no root equal to 1, the matrix $T$ being a real skew matrix in both cases;
and any real matrix of this form which is not infinite is a real orthogonal matrix.

6.04 Hermitian and quasi-hermitian forms. Let $H$ be a hermitian matrix and
$x = u + iv$ a vector of which $u$ and $iv$ are the real and imaginary parts;
then the bilinear form $f = S\bar{x}Hx$ is called a hermitian form. Such a form is
real since

$$\bar{f} = Sx\bar{H}\bar{x} = SxH'\bar{x} = S\bar{x}Hx = f.$$

In particular, if $x$ and $H$ are real, $f$ is a real quadratic form and, if $H = iT$ is a
pure imaginary, $T$ is skew and $f = 0$.

If we express $H$ in terms of its invariant vectors, say $H = \sum \alpha_ia_iS\bar{a}_i$, and
then put $x = \sum \xi_ia_i$, the form $f$ becomes $f = \sum \alpha_i\xi_i\bar{\xi}_i$. This shows that, if all
the roots of $H$ are positive, the value of $f$ is positive for all values of $x$; $H$ and $f$
are then said to be positive definite. Similarly if all the roots are negative,
$H$ and $f$ are negative definite. If some roots are 0 so that $f$ vanishes for some
value of $x \neq 0$, we say that $H$ and $f$ are semi-definite, positive or negative as
the case may be. It follows immediately that, when $H$ is semi-definite, $S\bar{x}Hx$
can only vanish if $Hx = 0$.


$^1$ The first part of the theorem applies also to complex orthogonal matrices.
Theorem 11. If $H$ and $K$ are hermitian and $H$ is definite, the elementary
divisors of $H\lambda - K$ are real and simple.

Since $H\lambda - K$ and $-(H\lambda - K)$ are equivalent, we may suppose that $H$ is
positive definite. Its roots are then positive so that

$$H^{1/2} = \sum \alpha_i^{1/2}a_iS\bar{a}_i$$

has real roots and hence is also hermitian so that $H^{-1/2}KH^{-1/2}$ is hermitian. But

$$H\lambda - K = H^{1/2}(\lambda - H^{-1/2}KH^{-1/2})H^{1/2}$$

so that $H\lambda - K$ is equivalent to $\lambda - H^{-1/2}KH^{-1/2}$ which has real and simple
elementary divisors by Theorem 2.

In order to include the theory of complex symmetric matrices we shall now
define a type of matrix somewhat more general than the hermitian matrix and
closely connected with it. If $A = A(\lambda)$ is a matrix whose coefficients are
analytic functions of a scalar variable $\lambda$, we shall call it quasi-hermitian if

$$A'(\lambda) = A(-\lambda). \tag{11}$$

For convenience we shall set $A''(\lambda)$ for $A(-\lambda)$ with a similar convention for
vector functions.

If $A = B + \lambda C$, $B$ and $C$ being functions of $\lambda^2$, then $A'' = B - \lambda C$ so that,
if $A$ is quasi-hermitian, $B$ is symmetric and $C$ skew just as in the case of a
hermitian matrix except that now $B$ and $C$ are not necessarily real. If $A$ is
any matrix,

$$2P' \equiv A' + A'' = 2P'', \qquad 2Q' \equiv (A' - A'')/\lambda = 2Q''$$

so that any matrix can be expressed in the form $P + \lambda Q$ where $P$ and $Q$ are
quasi-hermitian.

If $x = u + \lambda v$, where $u$ and $v$ are vectors which are functions of $\lambda^2$ and if $A$
is quasi-hermitian, then

$$f(\lambda) = Sx''Ax = f(-\lambda) \tag{12}$$

is called a quasi-hermitian form. Again, if $|1 + \lambda A| \not\equiv 0$, and we set
$Q = (1 - \lambda A)/(1 + \lambda A)$, then

$$Q' = \frac{1 - \lambda A'}{1 + \lambda A'} = \frac{1 - \lambda A''}{1 + \lambda A''} = (Q'')^{-1}$$

so that

$$Q'Q'' = 1. \tag{13}$$

We shall call such a matrix quasi-orthogonal.

6.05 Reduction of a quasi-hermitian form to the sum of squares. We have
seen in §5.06 that any matrix $A$ of rank $r$ can be expressed in the form

$$A = \sum_{s=1}^{r} \frac{A_sy_s\,SA'_sx_s}{Sx_sA_sy_s} \tag{14}$$

where

$$A_{s+1} = A_s - \frac{A_sy_s\,SA'_sx_s}{Sx_sA_sy_s}, \qquad A_1 = A, \quad Sx_sA_sy_s \neq 0,$$

and the null space of $A_{s+1}$ is obtained by adding $(y_1, y_2, \ldots, y_s)$ to the null
space of $A$ and the null space of $A'_{s+1}$ by adding $(x_1, x_2, \ldots, x_s)$ to that of $A'$.
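
The recursion behind (14) can be sketched in a few lines. The example below is an illustration under simplifying assumptions: it works with exact rational numbers, and it always takes $x_s$ and $y_s$ to be unit vectors $e_i$, $e_j$ picked at the first non-zero coordinate of $A_s$ (any choice with $Sx_sA_sy_s \neq 0$ would do). The name `rank_one_terms` is hypothetical.

```python
from fractions import Fraction


def rank_one_terms(a):
    """Peel off the terms A_s y_s S A'_s x_s / S x_s A_s y_s of (14) one at a time.

    x_s = e_i and y_s = e_j are unit vectors chosen at the first non-zero
    coordinate of A_s, so that S x_s A_s y_s is that coordinate.
    """
    n = len(a)
    current = [[Fraction(x) for x in row] for row in a]
    terms = []
    while True:
        pivot = next(((i, j) for i in range(n) for j in range(n)
                      if current[i][j] != 0), None)
        if pivot is None:          # A_s = 0: the reduction is complete
            break
        i, j = pivot
        c = current[i][j]                           # S x A y
        col = [current[k][j] for k in range(n)]     # A y  (column j)
        row = current[i][:]                         # A' x (row i)
        term = [[col[p] * row[q] / c for q in range(n)] for p in range(n)]
        terms.append(term)
        current = [[current[p][q] - term[p][q] for q in range(n)]
                   for p in range(n)]
    return terms


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]       # an illustrative matrix of rank 2
terms = rank_one_terms(A)
total = [[sum(t[p][q] for t in terms) for q in range(3)] for p in range(3)]

assert len(terms) == 2      # the number of terms equals the rank
assert total == A           # and the terms add up to A
```

Each step lowers the rank by exactly one, which is why the loop stops after $r$ terms.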

Suppose now that $A$ is quasi-hermitian and replace $y_s$, $x_s$ by $z_s$, $z''_s$ and set
$z_s = u_s + \lambda v_s$, $A_s = B_s + \lambda C_s$ so that

$$Sz''_sA_sz_s = Su_sB_su_s + \lambda^2(2Su_sC_sv_s - Sv_sB_sv_s)$$

and, so long as $A_s$ is not 0, we can clearly choose $z_s$ so that $Sz''_sA_sz_s \neq 0$. Each
matrix $A_s$ is then quasi-hermitian since $A'_s = A''_s$, and

$$A = \sum_{s=1}^{r} \frac{A_sz_s\,SA''_sz''_s}{Sz''_sA_sz_s}. \tag{15}$$

If $x$ is an arbitrary vector and

$$f = f(\lambda^2) = Sx''Ax, \qquad \varphi_s(\lambda) = Sx''A_sz_s = \psi_s(\lambda^2) + \lambda\chi_s(\lambda^2)$$

then $\psi_s$ and $\chi_s$ are linear functions of the coordinates of $x$ which are linearly
independent and

$$f = \sum_s \frac{Sx''A_sz_s\,SA''_sz''_sx}{Sz''_sA_sz_s} = \sum_s \frac{\varphi_s(\lambda)\varphi''_s(\lambda)}{Sz''_sA_sz_s} = \sum_s \frac{\psi_s^2(\lambda^2) - \lambda^2\chi_s^2(\lambda^2)}{Sz''_sA_sz_s} \tag{16}$$

which is the required expression for $f(\lambda^2)$ in terms of squares.

If $A$ is hermitian, then $\lambda = i$ and $\psi_s$, $\chi_s$, $Sz''_sA_sz_s = S\bar{z}_sA_sz_s$ are real and,
if $S\bar{z}_sA_sz_s = \alpha_s^{-1}$, (16) becomes

$$f = \sum_{s=1}^{r} \alpha_s(\psi_s^2 + \chi_s^2). \tag{17}$$

If $\lambda = 0$, then $A$ is symmetric and

$$f = SxAx = \sum_{s=1}^{r} \alpha_s\psi_s^2 \tag{18}$$

where the terms are all real if $A$ is real.

In terms of the matrices themselves these results may be expressed as follows.

Theorem 12. If $A$ is a hermitian matrix of rank $r$, there exist an infinity of
sets of vectors $a_s$ and real constants $\alpha_s$ such that

$$A = \sum_{s=1}^{r} \alpha_sa_sS\bar{a}_s \tag{19}$$

and, if $A$ is symmetric, there exists an infinity of sets of vectors $a_s$ and constants
$\alpha_s$ such that

$$A = \sum_{s=1}^{r} \alpha_sa_sSa_s \tag{20}$$

$a_s$ and $\alpha_s$ being real if $A$ is real.

If $\pi$ of the $\alpha$'s in (19) are positive and $\nu$ are negative, the difference $\sigma = \pi - \nu$
is called the signature of $A$. A given hermitian matrix may be brought to the
form (19) in a great variety of ways but, as we shall now show, the signature is
the same no matter how the reduction is carried out. Let $K_1$ be the sum of
the terms in (19) for which $\alpha_s$ is positive and $-K_2$ the sum of the terms for
which it is negative so that $A = K_1 - K_2$; the matrices $K_1$ and $K_2$ are positive
semi-definite and, if $k_1$ and $k_2$ are their ranks, we have $r = k_1 + k_2$. Suppose
that by a different method of reduction we get $A = M_1 - M_2$ where $M_1$ and
$M_2$ are positive semi-definite matrices of ranks $m_1$ and $m_2$ and $m_1 + m_2 = r$;
and suppose, if possible, that $k_2 < m_2$. The orders of the null spaces of $K_2$
and $M_1$ relative to the right ground of $A$ are $r - k_2$ and $r - m_1 = m_2$ and,
since $r - k_2 + m_2 > r$, there is at least one vector $x$ in the ground of $A$ which
is common to both these null spaces, that is,

$$Ax = K_1x = -M_2x \neq 0,$$

and hence $S\bar{x}K_1x = -S\bar{x}M_2x$. But both $K_1$ and $M_2$ are positive semi-definite;
hence we must have $S\bar{x}K_1x = 0$ which by §6.04 entails $K_1x = 0$. We have
therefore arrived at a contradiction and so must have $k_2 = m_2$ which is only
possible when the signature is the same in both cases.
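
The invariance of the signature can be observed on a small real symmetric matrix. The sketch below carries out the reduction (15) in the real case $\lambda = 0$ with exact rational arithmetic, choosing $z_s$ from a fixed list of trial vectors (the unit vectors and their pairwise sums, an assumption of this example that always yields some $z$ with $Sz A_s z \neq 0$ while $A_s \neq 0$). Running it with the list in two different orders gives two different reductions; the names `quad`, `candidate_vectors` and `signature` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def quad(a, z):
    """The scalar S z A z for a real vector z."""
    n = len(z)
    return sum(z[i] * a[i][j] * z[j] for i in range(n) for j in range(n))


def candidate_vectors(n):
    """Unit vectors e_i followed by the sums e_i + e_j (i < j)."""
    units = [[1 if k == i else 0 for k in range(n)] for i in range(n)]
    sums = [[1 if k in (i, j) else 0 for k in range(n)]
            for i, j in combinations(range(n), 2)]
    return units + sums


def signature(a, candidates):
    """Reduce a real symmetric matrix as in (15); return (signature, rank)."""
    n = len(a)
    cur = [[Fraction(x) for x in row] for row in a]
    pos = neg = 0
    while any(x != 0 for row in cur for x in row):
        z = next(z for z in candidates if quad(cur, z) != 0)
        c = quad(cur, z)                      # S z A_s z, the reciprocal of alpha_s
        az = [sum(cur[p][q] * z[q] for q in range(n)) for p in range(n)]
        cur = [[cur[p][q] - az[p] * az[q] / c for q in range(n)]
               for p in range(n)]
        if c > 0:
            pos += 1
        else:
            neg += 1
    return pos - neg, pos + neg


A = [[0, 1, 0], [1, 0, 0], [0, 0, 2]]        # illustrative symmetric matrix
cands = candidate_vectors(3)

first = signature(A, cands)
second = signature(A, list(reversed(cands)))
assert first == second == (1, 3)
```

The two runs pick different vectors $z_s$ and produce different constants $\alpha_s$, yet the counts of positive and negative terms agree.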

In the case of a skew matrix the reduction given by (16) is not convenient
and it is better to modify it as follows. Let $A' = -A$ and set

$$A_{s+1} = A_s - \frac{A_sx_s\,SA_sy_s - A_sy_s\,SA_sx_s}{Sx_sA_sy_s}, \qquad A_1 = A, \quad Sx_sA_sy_s \neq 0. \tag{21}$$

So long as $A_s \neq 0$, the condition $Sx_sA_sy_s \neq 0$ can always be satisfied by a
suitable choice of $x_s$ and $y_s$ and it is easily proved as in §5.06 that the null
space of $A_{s+1}$ is obtained from that of $A_s$ by adding $x_s$, $y_s$; also $A_s$ is skew so
that we must necessarily have $x_s \neq y_s$. It follows that the rank of $A$ is even
and

$$A = \sum_{s=1}^{r/2} \frac{A_sx_s\,SA_sy_s - A_sy_s\,SA_sx_s}{Sx_sA_sy_s} = \sum_{s=1}^{r/2} \alpha_s(a_{2s-1}Sa_{2s} - a_{2s}Sa_{2s-1}) \tag{22}$$

where each term in the summation is a skew matrix of rank 2 and

$$a_{2s-1} = A_sx_s, \qquad \alpha_s^{-1} = Sx_sA_sy_s, \qquad a_{2s} = A_sy_s.$$
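
The modified reduction (21) and (22) can be sketched in the same style. As before, this is an illustration under simplifying assumptions: exact rational arithmetic, and $x_s = e_i$, $y_s = e_j$ taken at the first non-zero coordinate of $A_s$. The name `skew_terms` is hypothetical.

```python
from fractions import Fraction


def skew_terms(a):
    """Split a skew matrix into the rank-2 skew terms of (22).

    x_s = e_i and y_s = e_j are unit vectors chosen at the first non-zero
    coordinate of A_s, so that S x_s A_s y_s is that coordinate.
    """
    n = len(a)
    cur = [[Fraction(x) for x in row] for row in a]
    terms = []
    while True:
        pivot = next(((i, j) for i in range(n) for j in range(n)
                      if cur[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        c = cur[i][j]                          # S x A y
        ax = [cur[k][i] for k in range(n)]     # A x (column i)
        ay = [cur[k][j] for k in range(n)]     # A y (column j)
        term = [[(ax[p] * ay[q] - ay[p] * ax[q]) / c for q in range(n)]
                for p in range(n)]
        terms.append(term)
        cur = [[cur[p][q] - term[p][q] for q in range(n)] for p in range(n)]
    return terms


A = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]                          # illustrative skew matrix

terms = skew_terms(A)
total = [[sum(t[p][q] for t in terms) for q in range(4)] for p in range(4)]

assert len(terms) == 2                         # rank 4 gives r/2 = 2 terms
assert total == A
assert all(t[p][q] == -t[q][p] for t in terms
           for p in range(4) for q in range(4))   # each term is skew
```

Because every term has rank 2, the number of terms is half the rank, which is the computational content of the statement that the rank of a skew matrix is even.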
This form corresponds to the one given in Theorem 12 for symmetric matrices.
If we put

$$\begin{aligned}
T &= \sum_{s=1}^{r/2} (e_{2s-1}Se_{2s} - e_{2s}Se_{2s-1}), &
R &= \sum_{s=1}^{r/2} \alpha_s(e_{2s-1}Se_{2s} - e_{2s}Se_{2s-1}), \\
P &= \sum_{s=1}^{r/2} (\alpha_sa_{2s}Se_{2s} + a_{2s-1}Se_{2s-1}), &
Q &= \sum_{s=1}^{r/2} (a_{2s}Se_{2s} + a_{2s-1}Se_{2s-1})
\end{aligned} \tag{23}$$

then (22) may be put in the form $A = PTP' = QRQ'$. When $r = n$, the determinant
of $T$ equals 1 and therefore $|A| = |P|^2$. The following theorem
summarizes these results.

Theorem 13. If $A$ is a skew matrix of rank $r$, then (i) $r$ is even; (ii) $A$ can be
expressed by rational processes in the form

$$A = \sum_{s=1}^{r/2} \alpha_s(a_{2s-1}Sa_{2s} - a_{2s}Sa_{2s-1}) = PTP' = QRQ' \tag{24}$$

where $P$, $Q$, $R$ and $T$ are given by (23); (iii) if $r = n$, the determinant of $A$ is a
perfect square, namely $|P|^2$; (iv) if $x$ and $y$ are any vectors and $w = \sum \alpha_s|a_{2s-1}a_{2s}|$, then

$$SxAy = S|xy|w. \tag{25}$$

The following theorem contains several known properties of hermitian
matrices.

Theorem 14. If $T(A)$ is an associated matrix for which $T'(A) = T(A')$,
then, when $A$ is quasi-hermitian, $T(A)$ is also quasi-hermitian.

For $A' = A''$ gives $T'(A) = T(A') = T(A'') = T''(A)$.

Particular cases of interest are: If $A$ is hermitian, $T(A)$ is hermitian. If
$T(\rho A) = \rho^sT(A)$ and $A$ is skew, then $T(A)$ is skew if $s$ is odd, symmetric if
$s$ is even.


6.06 The Kronecker method of reduction. Let $A = \sum_{i=1}^{r} x_iSy_i$ be a quasi-hermitian
matrix of rank $r$; then

$$\sum y_iSx_i = A' = A'' = \sum x''_iSy''_i, \tag{26}$$

from which it follows that $y_i$ is linearly dependent on $x''_1, x''_2, \ldots, x''_r$, say

$$y_i = \sum_{j=1}^{r} q_{ij}x''_j, \qquad |q_{ij}| \neq 0, \qquad (i = 1, 2, \ldots, r).$$

Using this value of $y_i$ we have

$$A = \sum q_{ij}x_iSx''_j, \qquad A' = \sum q_{ij}x''_jSx_i, \qquad A'' = \sum q''_{ij}x''_iSx_j$$

and therefore

$$q''_{ij} = q_{ji}. \tag{27}$$

Further, since $|q_{ij}| \neq 0$, we can find $s_{ij}$ ($i, j = 1, 2, \ldots, r$) so that

$$\sum_j q_{ij}s_{jk} = \delta_{ik}$$

and then (27) gives $s''_{ij} = s_{ji}$.

Let $x_1, \ldots, x_r, x_{r+1}, \ldots, x_n$ be a basis and $z_1, z_2, \ldots, z_n$ the reciprocal basis.
Then, if $w''_i = \sum_{j=1}^{r} s_{ji}z''_j$, the basis reciprocal to $y''_1, \ldots, y''_r, x_{r+1}, \ldots, x_n$ is $w_1,
\ldots, w_r, z_{r+1}, \ldots, z_n$. Hence

$$P = \sum_{i=1}^{r} w''_iSz_i = \sum s_{ij}z''_iSz_j$$

is quasi-hermitian. Further, if $u$ is a linear combination of $x_1, \ldots, x_r$, then
$Su''Pu$ is a quasi-hermitian form in the coefficients of that combination; and we
can choose $u$ so that this form is not 0. We also have

$$AP = \sum_{i=1}^{r} x_iSz_i,$$

whence $APu = u$.
Let

$$A_{s+1} = A_s - \frac{u_sSu''_s}{Su''_sP_su_s}, \qquad A_1 = A, \quad P_1 = P, \tag{28}$$

where $P_s$ is formed from $A_s$ in the same way as $P$ is from $A$ and $u_s$ is a vector
of the left ground of $A_s$ such that $Su''_sP_su_s \neq 0$; also, as above, $A_sP_su = u$
for any vector $u$ in the left ground of $A_s$ and $A_s$ is quasi-hermitian. The
vector $u''_s$ belongs to the right ground of $A_s$ and therefore every vector of the
null space of $A_s$ lies in the null space of $A_{s+1}$; also

$$A_{s+1}P_su_s = A_sP_su_s - \frac{u_sSu''_sP_su_s}{Su''_sP_su_s} = u_s - u_s = 0.$$

Hence the null space of $A_{s+1}$ is derived from that of $A_s$ by adding $P_su_s$ to it.
It then follows as in §5.06 that $A$ can be expressed in the form

$$A = \sum_{s=1}^{r} \frac{u_sSu''_s}{Su''_sP_su_s} \tag{29}$$

which is analogous to (15) and may be used in its place in proving Theorem 12.

We may also note that, if $Q$ is the matrix defined by $Q'e_i = x_i$ ($i = 1, 2,
\ldots, n$), then

$$A = Q'\Bigl(\sum_{i,j=1}^{r} q_{ij}e_iSe_j\Bigr)Q'' = Q'BQ''$$

where $B$ is the quasi-hermitian matrix $\sum_{i,j=1}^{r} q_{ij}e_iSe_j$. It may be shown by an
argument similar to that used for hermitian matrices that a basis for the $x$'s
may be so chosen that $Q$ is quasi-orthogonal provided $A$ is real.


6.07 Cogredient transformation. If SxAy and SxBy are two bilinear forms,
the second is said to be derived from the first by a cogredient transformation
if there exists a non-singular matrix P such that SxAy ≡ SPxBPy, that is,


(30)    $A = P'BP.$

When this relation holds between A and B, we shall say they are cogredient.
From (30) we derive immediately A' = P'B'P and therefore, if

$R = \dfrac{A + A'}{2} = R', \quad S = \dfrac{A - A'}{2} = -S',$

$U = \dfrac{B + B'}{2} = U', \quad V = \dfrac{B - B'}{2} = -V',$

then

$R + \lambda S = P'(U + \lambda V)P$

so that $R + \lambda S$ and $U + \lambda V$ are strictly equivalent.

Suppose conversely that we are given that $R + \lambda S$ and $U + \lambda V$, which are
quasi-hermitian, are strictly equivalent so that there exist constant non-singular
matrices p, q such that

$R + \lambda S = p(U + \lambda V)q$

or

(31)    $R = pUq, \quad S = pVq;$

then, remembering that R and U are symmetric, S and V skew, we have

$R = q'Up', \quad S = q'Vp'.$

Equating these two values of R and S, respectively, we get

$(q')^{-1}pU = Up'q^{-1}, \quad (q')^{-1}pV = Vp'q^{-1}$

or, if W stands for U or V indifferently, and

(32)    $J = (q')^{-1}p,$
we have

$JW = WJ',$

repeated application of which gives

$J^r W = W(J')^r.$

From this it follows that, if f(λ) is a scalar polynomial,

(33)    $f(J)W = Wf(J') = W(f(J))'.$

In particular, since |J| ≠ 0, we may choose f(λ) so that f(J) is a square root
of J and, denoting this square root by K, we have KW = WK' or

$W = K^{-1}WK', \quad K^2 = J \quad (W = U \text{ or } V).$

Using this in (31) we have

$R = pK^{-1}UK'q, \quad S = pK^{-1}VK'q$

and from (32) $p = q'J = q'K^2$ or

$pK^{-1} = q'K = (K'q)'.$

Hence, if we put P' = q'K, there follows

$R = P'UP, \quad S = P'VP$

or

$A = R + S = P'(U + V)P = P'BP.$

We therefore have the following theorem, which is due to Kronecker.

Theorem 15. A necessary and sufficient condition that A and B be cogredient
is that A + λA' and B + λB' shall be strictly equivalent.
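
The necessary half of Theorem 15 is easy to check numerically. The sketch below is only an illustration: the helper names (`matmul`, `transpose`, `pencil`, `pencils_agree`) and the particular matrices B and P are assumptions chosen for the example, not taken from the text. It builds A = P'BP and confirms that A + λA' = P'(B + λB')P for several integer values of λ.

```python
# Sketch: if A = P'BP, then A + lam*A' = P'(B + lam*B')P for every scalar lam.
# B and P are arbitrary illustrative choices.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def pencil(a, lam):
    """Return a + lam * a' for a square matrix a."""
    t = transpose(a)
    n = len(a)
    return [[a[i][j] + lam * t[i][j] for j in range(n)] for i in range(n)]

B = [[1, 2, 0], [0, 3, 1], [4, 0, 5]]   # neither symmetric nor skew
P = [[1, 1, 0], [0, 1, 2], [0, 0, 1]]   # unit upper triangular, hence non-singular
A = matmul(matmul(transpose(P), B), P)  # A is cogredient to B by construction

def pencils_agree(lam):
    """Compare A + lam*A' with P'(B + lam*B')P."""
    left = pencil(A, lam)
    right = matmul(matmul(transpose(P), pencil(B, lam)), P)
    return left == right

print(all(pencils_agree(lam) for lam in range(-3, 4)))
```

The converse direction, recovering P from a strict equivalence, needs the square root K of J constructed above and is not attempted here.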

If A and B are symmetric, these polynomials become A(1 + λ) and B(1 + λ)
which are always strictly equivalent provided the ranks of A and B are the
same. Hence quadratic forms of the same rank are always cogredient, as is
also evident from Theorem 12 which shows in addition that P may be taken
real if the signatures are the same.

The determination of P from (31) is unaltered if we suppose S symmetrical
instead of skew, or R skew instead of symmetrical. Hence

Theorem 16. If R, S, U, V are all symmetric or all skew, and if R + λS
and U + λV are strictly equivalent, we can find a constant non-singular matrix P
such that

$R + \lambda S = P'(U + \lambda V)P,$

that is, the corresponding pairs of forms are cogredient.

In the case of a hermitian form $SxA\bar{x}$ changing x into Px replaces A by
$P'A\bar{P}$ and we have in place of (30)

(34)    $A = P'B\bar{P}.$

If we put $B = \sum \beta_s b_s S \bar{b}_s$, then

$P'B\bar{P} = \sum \beta_s P'b_s S \bar{b}_s \bar{P} = \sum \beta_s P'b_s S \bar{P}'\bar{b}_s = \sum \beta_s c_s S \bar{c}_s,$

where $c_s = P'b_s$. Equation (34) can therefore hold only if the signature as well
as the rank is the same for B as for A. Conversely, if $A = \sum \alpha_s a_s S \bar{a}_s$, and
A and B have the same signature and rank the notation may be so arranged
that $\alpha_s$ and $\beta_s$ have the same signs for all s; then any matrix for which

$P'b_s = (\alpha_s/\beta_s)^{1/2} a_s \quad (s = 1, 2, \ldots, r), \quad |P| \neq 0,$

where r is the common rank of A and B, clearly satisfies (34).² Hence

Theorem 17. Two hermitian forms are cogredient if, and only if, they have
the same rank and signature.

The reader will readily prove the following extension of Theorem 16 by the
aid of the artifice used in the proof of Theorem 11.

Theorem 18. If A, B, C, D are hermitian matrices such that A + λB and
C + λD are (i) equivalent (ii) both definite for some value of λ, there exists a
constant non-singular matrix P such that

$A + \lambda B = P'(C + \lambda D)\bar{P}.$

6.08 Real representation of a hermitian matrix. Any matrix H = A + iB
in which A and B are real matrices of order n can be represented as a real
matrix of order 2n. For the matrix of order 2

$i_2 = \begin{Vmatrix} 0 & -1 \\ 1 & 0 \end{Vmatrix}$

satisfies the equation $i_2^2 = -1$ and, on forming the direct product of the original
set of matrices of order n and a set of order 2 in which $i_2$ lies, we get a set of
order 2n in which H is represented by

$\mathfrak{H} = A + i_2 B = \begin{Vmatrix} A & -B \\ B & A \end{Vmatrix}.$

As a verification of this we may note that

$\begin{Vmatrix} A & -B \\ B & A \end{Vmatrix} \begin{Vmatrix} C & -D \\ D & C \end{Vmatrix} = \begin{Vmatrix} AC - BD & -(AD + BC) \\ AD + BC & AC - BD \end{Vmatrix}$

which corresponds to

$(A + iB)(C + iD) = AC - BD + i(AD + BC).$
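
The multiplication rule just verified can also be checked on a concrete pair of matrices. The following is a minimal sketch; the function names `real_rep` and `matmul` and the sample matrices H and K are illustrative assumptions, not part of the text.

```python
# Sketch: the real 2n x 2n representation [[A, -B], [B, A]] of H = A + iB
# preserves matrix multiplication. Sample matrices are illustrative only.

def real_rep(h):
    """Map a complex n x n matrix (list of rows) to its real 2n x 2n representation."""
    n = len(h)
    top = [[h[i][j].real for j in range(n)] + [-h[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[h[i][j].imag for j in range(n)] + [h[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

H = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
K = [[2 - 1j, 1 + 1j], [4 + 0j, -1 + 3j]]

lhs = real_rep(matmul(H, K))            # represent the complex product
rhs = matmul(real_rep(H), real_rep(K))  # multiply the two representations
print(lhs == rhs)
```

Small integer entries are used so that the floating-point comparison is exact.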

² The proof preceding Theorem 15 generalizes readily up to equation (33); at that
point, however, if K = f(J), we require $\bar{K}' = f(\bar{J}')$, which is only true when the coefficients
of f(λ) are real.
This representation has the disadvantage that a complex scalar α + iβ is
represented by

$\begin{Vmatrix} \alpha & -\beta \\ \beta & \alpha \end{Vmatrix}$

which is not a scalar matrix although it is commutative with every matrix of
the form $\mathfrak{H}$. Consequently, if H has a complex root, this root does not
correspond to a root of $\mathfrak{H}$. If, however, all the roots of H are real, the relation
HK = αK is represented by $\mathfrak{H}\mathfrak{K} = \alpha\mathfrak{K}$ when α is real so that α is a root of
both H and $\mathfrak{H}$.

To prove the converse of this it is convenient to represent the vector x + iy
in the original space by (x, y) in the extended space. Corresponding to

$(A + iB)(x + iy) = Ax - By + i(Bx + Ay)$

we then have

$\begin{Vmatrix} A & -B \\ B & A \end{Vmatrix} (x, y) = (Ax - By, \; Bx + Ay).$

If therefore $\mathfrak{H}$ has a real root α and (x, y) is a corresponding invariant vector
so that

$\mathfrak{H}(x, y) = \alpha(x, y) = (\alpha x, \alpha y),$

we have

$Ax - By = \alpha x, \quad Bx + Ay = \alpha y,$

which gives

$(A + iB)(x + iy) = \alpha(x + iy).$

It follows that invariant vectors in the two representations correspond provided
they belong to real roots. This gives

Theorem 19. To every real root of H = A + iB there corresponds a real
root of

$\mathfrak{H} = \begin{Vmatrix} A & -B \\ B & A \end{Vmatrix}$

and vice-versa.

In this representation $\bar{H}$ and H' correspond to

$\begin{Vmatrix} A & B \\ -B & A \end{Vmatrix}, \quad \begin{Vmatrix} A' & -B' \\ B' & A' \end{Vmatrix}$

respectively, and hence, if H is hermitian, B' = -B so that $\mathfrak{H}$ is symmetric.
The theory of hermitian matrices of order n can therefore be made to depend
on that of real symmetric matrices of order 2n. For example, if we have proved
of real symmetric matrices that they have real roots and simple elementary
divisors, it follows that the same is true of hermitian matrices, thus reversing
the order of the argument made in §6.01.
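
As a small illustration of the last remark, the sketch below forms the real representation of one hermitian and one non-hermitian matrix and tests each for symmetry. The function names and the sample matrices are assumptions made for the example.

```python
# Sketch: a hermitian H = A + iB has A' = A and B' = -B, so [[A, -B], [B, A]] is symmetric.
# The matrices H and N are illustrative choices.

def real_rep(h):
    """Real 2n x 2n representation [[A, -B], [B, A]] of the complex matrix h = A + iB."""
    n = len(h)
    top = [[h[i][j].real for j in range(n)] + [-h[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[h[i][j].imag for j in range(n)] + [h[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

def transpose(a):
    return [list(row) for row in zip(*a)]

def is_hermitian(h):
    n = len(h)
    return all(h[i][j] == h[j][i].conjugate() for i in range(n) for j in range(n))

def is_symmetric(a):
    return a == transpose(a)

H = [[2 + 0j, 1 - 3j], [1 + 3j, 5 + 0j]]   # hermitian
N = [[2 + 0j, 1 - 3j], [1 - 3j, 5 + 0j]]   # complex symmetric, not hermitian

print(is_hermitian(H), is_symmetric(real_rep(H)))
print(is_hermitian(N), is_symmetric(real_rep(N)))
```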


CHAPTER VII 


COMMUTATIVE MATRICES 

7.01 We have already seen in §2.08 how to find all matrices commutative with
a given matrix x which has no repeated roots. We shall now treat the somewhat
more complicated case in which x is not so restricted. If

(1)    xy = yx

then $x^r y = y x^r$ so that, if f(λ) is a scalar polynomial, then f(x)y = yf(x). In
particular, if $f_i$ is a principal idempotent element of x, then $f_i y = y f_i$. Remembering
that $\sum f_i = 1$ we may set

$x = \sum f_i x = \sum x_i, \quad y = \sum f_i y = \sum y_i,$

and also, by §2.11, $x_i = \lambda_i f_i + z_i$, where $z_i$ is nilpotent. Since $y_i x_j = 0 = x_i y_j$
(i ≠ j), the determination of all matrices y which satisfy (1) is reduced to
finding y so that

$y_i x_i = x_i y_i, \quad y_i = f_i y = y f_i.$

We can therefore simplify the notation by first assuming that x has only one
principal idempotent element, 1, and one root which may be taken to be 0
without loss of generality; x is then nilpotent.

Let $e_1, e_2, \ldots, e_s$ be the partial idempotent elements of x and let their ranks
be $n_1, n_2, \ldots, n_s$; x is then composed of blocks of the form

$\begin{Vmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{Vmatrix}$    ($n_i$ rows and columns)

provided the fundamental basis is suitably chosen. To simplify the notation
further we divide the array of $n^2$ units $e_{ij}$ into smaller arrays formed by separating
off the first $n_1$ rows, then the next $n_2$ rows, and so on, and then making a
similar division of the columns (see figure 1). And when this is done, we shall
denote the units in the block in which the ith set of rows meets the jth set of
columns by

$e^{ij}_{pq} \quad (i, j = 1, 2, \ldots, s; \; p = 1, 2, \ldots, n_i; \; q = 1, 2, \ldots, n_j).$

It is also convenient to put $e^{ij}_{pq} = 0$ for $p > n_i$ or $q > n_j$.


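[Figure: a square array divided into the blocks 11, 12, 13; 21, 22, 23; 31, 32, 33 by sets of $n_1$, $n_2$, $n_3$ rows and $n_1$, $n_2$, $n_3$ columns.]

Fig. 1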


The expression for x is now

$x = \sum_{i=1}^{s} \sum_{p=1}^{n_i - 1} e^{ii}_{p,p+1} = \sum_{i=1}^{s} x_i$

and we may set

$y = \sum_{i,j=1}^{s} y_{ij}$

where

$y_{ij} = \sum_{p=1}^{n_i} \sum_{q=1}^{n_j} \eta^{ij}_{pq} e^{ij}_{pq}.$

The equation xy = yx is then equivalent to

(2)    $x_i y_{ij} = y_{ij} x_j \quad (i, j = 1, 2, \ldots, s).$

If we now suppress for the moment the superscripts i, j, which remain constant
in a single equation in (2), we may replace (2) by

$\sum_{p=1}^{n_i - 1} e_{p,p+1} \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} \eta_{lm} e_{lm} = \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} \eta_{lm} e_{lm} \sum_{q=1}^{n_j - 1} e_{q,q+1}$

or

(3)    $\sum_{p=1}^{n_i - 1} \sum_{m=1}^{n_j} \eta_{p+1,m} e_{pm} = \sum_{l=1}^{n_i} \sum_{q=1}^{n_j - 1} \eta_{lq} e_{l,q+1}.$

Equating corresponding coefficients then gives

(4)    $\eta_{p+1,q+1} = \eta_{pq}.$

Since q ≥ 1 on the right of (3), it follows that $\eta_{p+1,1} = 0$ (p = 1, 2, ..., $n_i - 1$)
and, since p ≤ $n_i - 1$ on the left, $\eta_{n_i q} = 0$ (q = 1, 2, ..., $n_j - 1$)
and hence from (4)

(5)    $\eta_{p+t,\,t} = 0 = \eta_{n_i - t,\,q - t}$

where p = 1, 2, ..., $n_i - t$ and t = 1, 2, ... in the first part, and q = t + 1, t + 2, ..., $n_j - 1$ and t = 0, 1, ... in the second.

From (4) we see that in $y_{ij}$ all coordinates in an oblique line parallel to the
main diagonal of the original array have the same value; from the first part
of (5) those to the left of the oblique AB through the upper left hand corner
are zero, as are also those to the left of the oblique CD through the lower right
hand corner; the coordinates in the other obliques are arbitrary except that,
as already stated, the coordinates in the same oblique are equal by (4). This
state of affairs is made clearer by figure 2 where all coordinates are 0 except
those in the shaded portion.

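[Figure: the rectangular array of $y_{ij}$ for the case $n_i > n_j$, with the obliques AB and CD bounding the shaded portion.]

Fig. 2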


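As an example of this take

$x = \begin{Vmatrix}
a & 1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & a & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & a & 1 & \cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & a & 1 & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & a & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot & a & 1 & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a & 1 & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a & 1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a
\end{Vmatrix}$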


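The above rules then give for y

(6)    $\begin{Vmatrix}
a_0 & a_1 & \cdot & b_0 & b_1 & \cdot & \cdot & c_0 & c_1 \\
\cdot & a_0 & \cdot & \cdot & b_0 & \cdot & \cdot & \cdot & c_0 \\
d_0 & d_1 & e_0 & e_1 & e_2 & \cdot & f_0 & f_1 & f_2 \\
\cdot & d_0 & \cdot & e_0 & e_1 & \cdot & \cdot & f_0 & f_1 \\
\cdot & \cdot & \cdot & \cdot & e_0 & \cdot & \cdot & \cdot & f_0 \\
g_0 & g_1 & h_0 & h_1 & h_2 & i_0 & i_1 & i_2 & i_3 \\
\cdot & g_0 & \cdot & h_0 & h_1 & \cdot & i_0 & i_1 & i_2 \\
\cdot & \cdot & \cdot & \cdot & h_0 & \cdot & \cdot & i_0 & i_1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & i_0
\end{Vmatrix}$

where the dots represent 0.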
If we arrange the notation so that $n_1 \le n_2 \le \cdots \le n_s$, a simple enumeration
shows that the number of independent parameters in y is

$(2s - 1)n_1 + (2s - 3)n_2 + \cdots + n_s.$

We have therefore the following theorem which is due to Frobenius.

Theorem 1. If the elementary divisors of x are $(\lambda - \lambda_i)^{n_{ij}}$, i = 1, 2, ..., r,
j = 1, 2, ..., $s_i$, where $\lambda_1, \lambda_2, \ldots, \lambda_r$ are all different and $n_{i1} \le n_{i2} \le \cdots \le n_{is_i}$,
then the general form of a matrix commutative with x depends on

$\sum_{i=1}^{r} \sum_{j=1}^{s_i} (2s_i - 2j + 1) n_{ij}$

independent parameters.
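
The count in Theorem 1 can be compared with a direct computation for a single root. The sketch below is illustrative only: it treats xy - yx = 0 as a homogeneous linear system in the coordinates of y and finds its nullity by elimination over the rationals. All function names are assumptions introduced for the example.

```python
# Sketch: compare the parameter count (2s-1)n_1 + (2s-3)n_2 + ... + n_s with the
# dimension of the solution space of xy = yx for a nilpotent x in canonical form.
from fractions import Fraction

def jordan_nilpotent(sizes):
    """Block-diagonal nilpotent matrix with canonical blocks of the given orders."""
    n = sum(sizes)
    x = [[0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for p in range(size - 1):
            x[start + p][start + p + 1] = 1
        start += size
    return x

def rank(rows):
    """Rank by forward elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def commutant_dimension(x):
    """Number of independent y with xy = yx, as the nullity of a linear system."""
    n = len(x)
    equations = []
    for i in range(n):
        for j in range(n):
            row = [0] * (n * n)          # unknown y[k][l] sits at index k*n + l
            for k in range(n):
                row[k * n + j] += x[i][k]   # (xy)[i][j] = sum_k x[i][k] y[k][j]
                row[i * n + k] -= x[k][j]   # (yx)[i][j] = sum_k y[i][k] x[k][j]
            equations.append(row)
    return n * n - rank(equations)

def frobenius_count(sizes):
    """(2s-1)n_1 + (2s-3)n_2 + ... + n_s with the orders in ascending order."""
    ordered = sorted(sizes)
    s = len(ordered)
    return sum((2 * s - 2 * j + 1) * n_j for j, n_j in enumerate(ordered, start=1))

for sizes in ([3], [1, 2], [2, 2], [2, 3, 4]):
    print(sizes, frobenius_count(sizes), commutant_dimension(jordan_nilpotent(sizes)))
```

For several distinct roots the counts for the separate roots are simply added, as in the double sum of the theorem.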

7.02 Commutative sets of matrices. The simple condition xy = yx may
be replaced by the more stringent one that y is commutative with every matrix
which is commutative with x. To begin with we shall merely assume that y
is commutative with each of a particular set of partial idempotent elements $e_i$;
as in the previous section we may assume that x has only one principal
idempotent element.

In order that $e_i y = y e_i$ for every i it is necessary and sufficient that $y_{ij} = 0$
when i ≠ j; if $u_1, u_2, \ldots, u_s$ are the partial nilpotent elements of x corresponding
to $e_1, e_2, \ldots, e_s$ and we set $m_i = n_i - 1$, this gives for y

(7)    $y = \sum_i (\eta_{i0} e_i + \eta_{i1} u_i + \cdots + \eta_{im_i} u_i^{m_i}).$

If we now put $z = \sum_i (\beta_i e_i + u_i)$, where no $\beta_i = 0$, and if g(λ) is any scalar
polynomial, then (cf. §2.11)

$g(z) = \sum g(\beta_i e_i + u_i) = \sum \big( g(\beta_i) e_i + g'(\beta_i) u_i + \cdots + g^{(m_i)}(\beta_i) u_i^{m_i}/m_i! \big)$

and when y is given, we can always find g(λ) so that

$\eta_{ik} = g^{(k)}(\beta_i)/k!$

provided the β's are all different. Hence every y, including x itself, can be
expressed as a polynomial in z.

We now impose the more exacting condition that y is permutable with every
matrix permutable with x. Let $u_{ij}$ (i ≠ j) be the matrix of the same form as
$y_{ij}$ in §7.01 but with zero coordinates everywhere except in the principal oblique;
for example in (6) $u_{23}$ is obtained by putting $f_0 = 1$ and making every other
coordinate 0. We then have

$e_i u_{ij} = u_{ij} = u_{ij} e_j, \quad u_i u_{ij} = u_{ij} u_j.$

Hence $y u_{ij} = u_{ij} y$ gives $y_{ii} u_{ij} = u_{ij} y_{jj}$ and therefore from (7)

$(\eta_{i0} e_i + \eta_{i1} u_i + \cdots + \eta_{im_i} u_i^{m_i}) u_{ij} = u_{ij} (\eta_{j0} e_j + \eta_{j1} u_j + \cdots + \eta_{jm_j} u_j^{m_j}) = (\eta_{j0} e_i + \eta_{j1} u_i + \cdots + \eta_{jm_j} u_i^{m_j}) u_{ij}$

from which we readily derive for all i, j and k

$\eta_{ik} = \eta_{jk}$

with the understanding that $\eta_{ik}$ does not actually occur when $k > m_i$. When
x is the matrix used in deriving (6), these conditions give in place of (6)

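$\begin{Vmatrix}
a_0 & a_1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & a_0 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & a_0 & a_1 & a_2 & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & a_0 & a_1 & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & a_0 & \cdot & \cdot & \cdot & \cdot \\
\cdot & \cdot & \cdot & \cdot & \cdot & a_0 & a_1 & a_2 & a_3 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a_0 & a_1 & a_2 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a_0 & a_1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a_0
\end{Vmatrix}.$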


Comparing this form with (7) we see that y is now a scalar polynomial in x,
which in the particular case given above becomes g(x - a) where

$g(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + a_3\lambda^3.$

The results of this section may be summarized as follows.

Theorem 2. Any matrix which is commutative, not only with x, but also with
every matrix commutative with x, is a scalar polynomial in x.
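
Theorem 2 can be illustrated on a small case by computing the commutant twice. The sketch below is an illustration under assumptions: the helper names are invented for the example and x is a nilpotent matrix with blocks of orders 2 and 1. It finds a basis of all matrices commuting with x, then a basis of all matrices commuting with every member of that basis.

```python
# Sketch: for x with canonical blocks of orders 2 and 1, the matrices commuting with
# everything that commutes with x should reduce to the polynomials c0 + c1*x.
from fractions import Fraction

def nullspace(rows, ncols):
    """Basis of the solutions of rows * v = 0, by Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_index, c in enumerate(pivots):
            v[c] = -m[row_index][free]
        basis.append(v)
    return basis

def commutator_rows(x):
    """Linear equations expressing xy - yx = 0 in the n*n unknown entries of y."""
    n = len(x)
    rows = []
    for i in range(n):
        for j in range(n):
            row = [0] * (n * n)
            for k in range(n):
                row[k * n + j] += x[i][k]
                row[i * n + k] -= x[k][j]
            rows.append(row)
    return rows

def commutant(mats):
    """Basis (as n x n matrices) of all y commuting with every matrix in mats."""
    n = len(mats[0])
    rows = [row for x in mats for row in commutator_rows(x)]
    return [[v[i * n:(i + 1) * n] for i in range(n)] for v in nullspace(rows, n * n)]

x = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
first = commutant([x])      # all matrices commutative with x
second = commutant(first)   # all matrices commutative with every such matrix
print(len(first), len(second))
```

Here the reduced equation of x is x² = 0, so the polynomials in x form a space of dimension 2, which is what the second commutant should have.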

7.03 Rational methods. Since the solution of xy - yx = 0 for y can be
regarded as equivalent to solving a system of linear homogeneous equations,
the solution should be expressible rationally in terms of suitably chosen
parameters; the method of §7.01, though elementary and direct, cannot therefore be
regarded as wholly satisfactory. The following discussion, which is due to
Frobenius, avoids this difficulty but is correspondingly less explicit.

As before let xy = yx and set a = λ - x; also let $b = L^{-1}aM^{-1}$ be the normal
form of a. If u is an arbitrary polynomial in λ and we set

$P = L^{-1}(au + y)L, \quad Q = M(ua + y)M^{-1}$
then

$Pb = PL^{-1}aM^{-1} = L^{-1}(au + y)aM^{-1} = L^{-1}a(ua + y)M^{-1} = bQ.$

Conversely, if Pb = bQ and, using the division transformation, we set

(9)    $LPL^{-1} = av + y, \quad M^{-1}QM = v_1 a + y_1$

where y and $y_1$ are constants, then

$0 = Pb - bQ = L^{-1}(av + y)aM^{-1} - L^{-1}a(v_1 a + y_1)M^{-1}$

or $a(v - v_1)a = ay_1 - ya$. Here the degree on the left is at least 2 and on
the right only 1 and hence by the usual argument both sides of the equation
vanish. This gives

$ava = av_1 a, \quad ay_1 = ya$

whence $v_1 = v$ and, since a = λ - x, also $y_1 = y$ so that xy = yx.

Hence we can find all matrices commutative with x by finding all solutions of

(10)    Pb = bQ.

Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be the invariant factors¹ of a and $n_1, n_2, \ldots, n_n$ the
corresponding degrees so that b is the diagonal matrix $\sum \alpha_i e_{ii}$, and let
$P = \|P_{ij}\|$, $Q = \|Q_{ij}\|$; then

(11)    $P_{ij}\alpha_j = \alpha_i Q_{ij}.$

By the division transformation we may set

$P_{ij} = R_{ij}\alpha_i + p_{ij}, \quad Q_{ij} = S_{ij}\alpha_j + q_{ij}$

and then from (10) we have

$R_{ij} = S_{ij}, \quad p_{ij}\alpha_j = \alpha_i q_{ij}$

or, if $p = \|p_{ij}\|$, $q = \|q_{ij}\|$,

(12)    pb = bq.

Hence P = p, Q = q is a solution of (10) for which the degree of $p_{ij}$ is less than
that of $\alpha_i$ and the degree of $q_{ij}$ is less than that of $\alpha_j$. It is then evident that,
when the general solution p, q of (12) is found, then the general solution of
(10) has the form

$P = bR + p, \quad Q = Rb + q$

where R is an arbitrary matric polynomial in λ. We are however not concerned
with R; for

$LPL^{-1} = LbRL^{-1} + LpL^{-1} = aM^{-1}RL^{-1} + LpL^{-1}$

so that in (9) the value of y depends on p only.


1 Since we may add a scalar to x we may clearly assume that the rank of a is n. 
The general solution of (12) is given by

(13)    $p_{ij} = \dfrac{\alpha_i}{\alpha_j} s_{ij}, \quad p_{ji} = s_{ji}, \quad q_{ij} = s_{ij}, \quad q_{ji} = \dfrac{\alpha_i}{\alpha_j} s_{ji} \quad (i > j); \qquad p_{ii} = q_{ii} = s_{ii},$

where, for i ≥ j, $s_{ij}$ and $s_{ji}$ are arbitrary polynomials whose degree is at most $n_j - 1$ and which
therefore each depend on $n_j$ parameters. It follows that the total number of
parameters in the value of y is that already given in §7.01.

7.04 The direct product. We shall consider in this section some properties 
of the direct product which was defined in §5.10. 

Theorem 3. If $f_{ij}$ (i, j = 1, 2, ..., m) is a set of matrices, of order n, for
which

(14)    $f_{ij} f_{pq} = \delta_{jp} f_{iq}, \quad \sum_{i=1}^{m} f_{ii} = 1,$

then m is a factor of n and any matrix of order n can be expressed uniquely in the
form $\sum a_{ij} f_{ij}$ where each $a_{ij}$ is commutative with every $f_{pq}$; and, if n = mr, the rank
of each $f_{pq}$ is r.

For, if x is an arbitrary matrix and we set

(15)    $a_{ij} = \sum_{k=1}^{m} f_{ki} x f_{jk},$

a short calculation shows:

(i) $x = \sum a_{ij} f_{ij}$,

(ii) $a_{ij} f_{pq} = f_{pq} a_{ij}$ for all i, j, p, q;

(iii) the set $\mathfrak{A}$ of all matrices of the form (15) is closed under the operations
of addition and multiplication;

(iv) if $b_{11}, b_{12}, \ldots$ are members of $\mathfrak{A}$, then $\sum b_{ij} f_{ij}$ is zero if, and only if, each
$b_{ij} = 0$.

If $(a_1, a_2, \ldots, a_l)$ is a basis of $\mathfrak{A}$, it follows that

$(a_k f_{ij}; \; k = 1, 2, \ldots, l; \; i, j = 1, 2, \ldots, m)$

is equivalent to the basis $(e_{ij}; \; i, j = 1, 2, \ldots, n)$ of the set of matrices of order
n. This basis contains $lm^2$ independent elements and hence $n^2 = lm^2$ so that
n = mr, $l = r^2$. Let $r_{ij}$ be the rank of $f_{ij}$. Since $f_{ij} = f_{ii} f_{ij}$, it follows from
Theorem 8 of chapter I that $r_{ij} \le r_{ii}$; also from $f_{ij} f_{ji} = f_{ii}$ we have $r_{ii} \le r_{ij}$;
hence $r_{ij} = r_{ii}$ and therefore each $r_{ij}$ has the same value. Finally, since
$1 = \sum f_{ii}$ and $f_{ii} f_{jj} = 0$ (i ≠ j) and $r_{ii} = r_{ij}$, we have $m r_{ii} = n$ and hence each
$r_{ij} = r$.
We shall now show that a basis $g_{ij}$ can be chosen for $\mathfrak{A}$ which satisfies the
relations (14) with r in place of m. Since the rank of $f_{ii}$ is r, we can set

(16)    $f_{ii} = \sum_{k=1}^{r} \alpha_{ik} S\beta_{ik} \quad (i = 1, 2, \ldots, m)$

where the sets of vectors $(\alpha_{ik})$ and $(\beta_{ik})$ (i = 1, 2, ..., m; k = 1, 2, ..., r)
each form a basis of the n-space since $\sum_{i=1}^{m} f_{ii} = 1$. If $(\alpha'_{ik})$, $(\beta'_{ik})$ are the
corresponding reciprocal sets and

$p_i = \sum_{k=1}^{r} \beta'_{ik} S\alpha'_{ik} \quad (i = 1, 2, \ldots, m)$

we have, since $S\alpha_{ik}\alpha'_{jl} = \delta_{ij}\delta_{kl}$,

$\sum_i p_i = \sum_i f_{ii} \sum_j p_j = \sum \alpha_{ik} \, S\beta_{ik}\beta'_{jl} \, S\alpha'_{jl} = \sum \alpha_{ik} S\alpha'_{ik} = 1,$

and similarly

$f_{ii} p_i = \sum_k \alpha_{ik} S\alpha'_{ik}, \quad f_{ii} p_j = 0 \quad (i \neq j).$

Hence

$f_{ii} = f_{ii} \sum_j p_j = \sum_k \alpha_{ik} S\alpha'_{ik},$

that is $\beta_{ik} = \alpha'_{ik}$.

Since $f_{ij} = f_{ii} f_{ij} f_{jj}$, the left ground of $f_{ij}$ is the same as that of $f_{ii}$ and its right
ground is the same as that of $f_{jj}$. Let

$f_{1j} = \sum_k \alpha_{1k} S\gamma_{jk}.$

The vectors $\gamma_{jk}$ (k = 1, 2, ..., r) then form a basis for the set $\alpha'_{jk}$ (k = 1, 2,
..., r) and, since the basis chosen for this set in (16) is immaterial, we may
suppose $\gamma_{jk} = \alpha'_{jk}$ (j = 1, 2, ..., m; k = 1, 2, ..., r), that is,

$f_{1j} = \sum_k \alpha_{1k} S\alpha'_{jk}.$

Similarly we may set $f_{i1} = \sum_k \alpha_{ik} S\theta_{ik}$ and then since

$\sum_k \alpha_{1k} S\alpha'_{1k} = f_{11} = f_{1j} f_{j1} = \sum_{k,s} \alpha_{1k} \, S\alpha'_{jk}\alpha_{js} \, S\theta_{js} = \sum_k \alpha_{1k} S\theta_{jk},$

we have $\theta_{jk} = \alpha'_{1k}$ and therefore

$f_{i1} = \sum_k \alpha_{ik} S\alpha'_{1k}$

and finally

(18)    $f_{ij} = f_{i1} f_{1j} = \sum_{k,s} \alpha_{ik} \, S\alpha'_{1k}\alpha_{1s} \, S\alpha'_{js} = \sum_k \alpha_{ik} S\alpha'_{jk}.$

If we now set $\alpha_{ik} = Pe_{(i-1)r+k}$, then by §1.09 P is non-singular and
$\alpha'_{ik} = (P')^{-1} e_{(i-1)r+k}$; hence, if

(19)    $h_{ij} = \sum_{k=1}^{r} e_{(i-1)r+k,\;(j-1)r+k},$

we have

(20)    $f_{ij} = Ph_{ij}P^{-1}.$

Also if

(21)    $k_{ij} = \sum_{s=0}^{m-1} e_{sr+i,\;sr+j},$

then

(22)    $k_{ij} h_{p+1,q+1} = e_{pr+i,\;qr+j} = h_{p+1,q+1} k_{ij}$

so that the set $(e_{ij})$ of all matrices of order n may be regarded as the direct
product of the sets $(h_{ij})$ and $(k_{ij})$. Finally, since any matrix can be expressed
in the form $\sum b_{ij} h_{ij}$, where the $b_{ij}$ depend on the basis $(k_{ij})$, it follows that an
arbitrary matrix can also be expressed in the form

$P \sum b_{ij} h_{ij} P^{-1} = \sum P b_{ij} P^{-1} f_{ij};$

$P b_{ij} P^{-1}$ depends on the basis $(P k_{ij} P^{-1})$ and hence, if we set

$g_{ij} = P k_{ij} P^{-1} \quad (i, j = 1, 2, \ldots, r)$

the g's form a basis of $\mathfrak{A}$ which satisfies (14).
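
The index bookkeeping in (19) to (22), and the decomposition (15) of Theorem 3, can be checked mechanically in a small case. The sketch below takes m = 2, r = 3 and P = 1, so that $f_{ij} = h_{ij}$. The names `unit`, `h`, `k_unit` and `decompose` are assumptions introduced for the example, as is the choice of a separate letter k for the second set of units.

```python
# Sketch of equations (19)-(22) and of the decomposition (15) for m = 2, r = 3 (n = 6).
# All names here are illustrative; P is taken to be the identity.

m, r = 2, 3
n = m * r

def zero():
    return [[0] * n for _ in range(n)]

def unit(i, j):
    """Matric unit e_ij of order n, with 1-based indices."""
    e = zero()
    e[i - 1][j - 1] = 1
    return e

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def h(i, j):
    """h_ij = sum over k = 1..r of e_{(i-1)r+k, (j-1)r+k}, for i, j = 1..m."""
    total = zero()
    for k in range(1, r + 1):
        total = add(total, unit((i - 1) * r + k, (j - 1) * r + k))
    return total

def k_unit(i, j):
    """k_ij = sum over s = 0..m-1 of e_{sr+i, sr+j}, for i, j = 1..r."""
    total = zero()
    for s in range(m):
        total = add(total, unit(s * r + i, s * r + j))
    return total

def relation_22_holds():
    """Check k_ij h_{p+1,q+1} = e_{pr+i, qr+j} = h_{p+1,q+1} k_ij for all indices."""
    for p in range(m):
        for q in range(m):
            for i in range(1, r + 1):
                for j in range(1, r + 1):
                    target = unit(p * r + i, q * r + j)
                    hk = matmul(h(p + 1, q + 1), k_unit(i, j))
                    kh = matmul(k_unit(i, j), h(p + 1, q + 1))
                    if hk != target or kh != target:
                        return False
    return True

def decompose(x):
    """Coefficients a_ij = sum_k f_ki x f_jk of (15), taking f_ij = h_ij."""
    coeffs = {}
    for i in range(1, m + 1):
        for j in range(1, m + 1):
            a = zero()
            for k in range(1, m + 1):
                a = add(a, matmul(matmul(h(k, i), x), h(j, k)))
            coeffs[(i, j)] = a
    return coeffs

x = [[(3 * i + 5 * j) % 7 for j in range(n)] for i in range(n)]  # arbitrary test matrix
coeffs = decompose(x)
rebuilt = zero()
for (i, j), a in coeffs.items():
    rebuilt = add(rebuilt, matmul(a, h(i, j)))

print(relation_22_holds(), rebuilt == x)
```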


7.05 Functions of commutative matrices. Let x and y be commutative
matrices whose distinct roots are $\lambda_1, \lambda_2, \ldots$ and $\mu_1, \mu_2, \ldots$, respectively and
let $R_i$ be the principal idempotent unit of x corresponding to $\lambda_i$ and similarly
$S_j$ the principal idempotent unit of y corresponding to $\mu_j$. Since $R_i$ and $S_j$
are scalar polynomials in x and y, they are commutative. If we set

$T_{ij} = R_i S_j,$

those $T_{ij}$ which are not 0 are linearly independent; for if $\sum \xi_{ij} T_{ij} = 0$, then

$0 = R_p \sum \xi_{ij} T_{ij} S_q = \xi_{pq} T_{pq},$

since $R_p R_i = \delta_{pi} R_p$, $S_j S_q = \delta_{jq} S_q$, so that either $\xi_{pq} = 0$ or $T_{pq} = 0$.
From the definition of $T_{ij}$ it follows that $T_{ij} T_{pq} = 0$ when i ≠ p or j ≠ q,
and $T_{ij}^2 = T_{ij}$, $\sum T_{ij} = 1$; hence

$x = \sum_{i,j} [\lambda_i + (x - \lambda_i)] T_{ij}, \quad y = \sum_{i,j} [\mu_j + (y - \mu_j)] T_{ij}$

where $(x - \lambda_i) T_{ij}$ and $(y - \mu_j) T_{ij}$ are nilpotent. If ψ(λ, μ) is any scalar
polynomial then

$\psi(\lambda, \mu) = \psi(\lambda_i, \mu_j) + \sum_{r,s} \psi_{rs} (\lambda - \lambda_i)^r (\mu - \mu_j)^s$

where the $\psi_{rs}$ are scalars; we have therefore

$\psi(x, y) = \sum_{i,j} \Big[ \psi(\lambda_i, \mu_j) + \sum_{r,s} \psi_{rs} (x - \lambda_i)^r (y - \mu_j)^s \Big] T_{ij} = \sum_{i,j} \Big[ \psi(\lambda_i, \mu_j) T_{ij} + \sum_{r,s} \psi_{rs} T^{(rs)}_{ij} \Big]$

where

$T^{(rs)}_{ij} = (x - \lambda_i)^r (y - \mu_j)^s T_{ij}$

and r runs from 1 to $\rho_i - 1$, where $\rho_i$ is the smallest integer for which
$(x - \lambda_i)^{\rho_i} R_i = 0$, and s has a similar range with respect to y. The matrices
$T^{(rs)}_{ij}$ are commutative and each is nilpotent; and hence any linear combination
of them is also nilpotent.

Let

$z = \sum \psi(\lambda_i, \mu_j) T_{ij}, \quad w = \sum \psi_{rs} T^{(rs)}_{ij};$

then w, being the sum of commutative nilpotent matrices, is nilpotent. If we
take in z only terms for which $T_{ij} \neq 0$, we see immediately that the roots of z
are the corresponding coefficients $\psi(\lambda_i, \mu_j)$; and the reduced characteristic
function of z is found as in §2.12. We have therefore the following theorem
which is due to Frobenius.


Theorem 4. If $R_i$, $S_j$ (i = 1, 2, ...; j = 1, 2, ...) are the principal
idempotent units of the commutative matrices x, y and $T_{ij} = R_i S_j$; and if $\lambda_i$, $\mu_j$ are the
corresponding roots of x and y, respectively; then the roots of any scalar function
ψ(x, y) of x and y are $\psi(\lambda_i, \mu_j)$ where i and j take only those values for which $T_{ij} \neq 0$.

This theorem extends immediately to any number of commutative matrices.
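
The restriction to pairs with $T_{ij} \neq 0$ is easiest to see for diagonal matrices, where the principal idempotent units can be written down at once. The sketch below covers only that special case, and the sample matrices, the polynomial ψ and all function names are assumptions made for the illustration.

```python
# Sketch (diagonal case only): the roots of psi(x, y) are psi(lam_i, mu_j) for exactly
# those pairs (i, j) with T_ij = R_i S_j != 0.

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def idempotent(values, root):
    """Principal idempotent unit of diag(values) belonging to the given root."""
    return diag([1 if v == root else 0 for v in values])

def predicted_roots(xv, yv, psi):
    """Values psi(lam, mu) over the pairs of roots whose product T = R S is not zero."""
    result = set()
    for lam in set(xv):
        for mu in set(yv):
            t = matmul(idempotent(xv, lam), idempotent(yv, mu))
            if any(any(row) for row in t):
                result.add(psi(lam, mu))
    return result

def actual_roots(xv, yv, psi):
    """For diagonal x and y, psi(x, y) is diagonal with entries psi(x_kk, y_kk)."""
    return {psi(a, b) for a, b in zip(xv, yv)}

xv = [1, 1, 2]          # x = diag(1, 1, 2), roots 1 and 2
yv = [3, 5, 5]          # y = diag(3, 5, 5), roots 3 and 5

def psi(lam, mu):
    return lam * mu + lam

print(sorted(predicted_roots(xv, yv, psi)), sorted(actual_roots(xv, yv, psi)))
```

In this example the pair of roots (2, 3) gives $T_{ij} = 0$, so ψ(2, 3) does not occur among the roots of ψ(x, y).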


7.06 Sylvester's identities. It was shown in §2.08 that, if the roots of x
are all distinct, the only matrices commutative with it are scalar polynomials
in x; and in doing so certain identities, due to Sylvester, were derived. We
shall now consider these identities in more detail.

We have already seen that in

$f(\lambda) = |\lambda - x| = \lambda^n + a_1\lambda^{n-1} + \cdots + a_{n-1}\lambda + a_n$

the coefficient $a_r$ of $\lambda^{n-r}$ is $(-1)^r$ times the sum of the principal minors of x
of order r; these coefficients are therefore homogeneous polynomials of degree r
in the coordinates of x. We shall now denote $(-1)^r a_r$ by $\begin{bmatrix} x \\ r \end{bmatrix}$. If x is replaced
by λx + μy, then $\begin{bmatrix} x \\ r \end{bmatrix}$ can be expressed as a homogeneous polynomial in λ, μ
of degree r, and we shall write

(23)    $\begin{bmatrix} \lambda x + \mu y \\ r \end{bmatrix} = \sum_{s=0}^{r} \begin{bmatrix} x & y \\ s & r-s \end{bmatrix} \lambda^s \mu^{r-s}.$

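We shall further set, as in §2.08,

(24)    $(\lambda x + \mu y)^r = \sum_{s=0}^{r} \begin{Bmatrix} x & y \\ s & r-s \end{Bmatrix} \lambda^s \mu^{r-s}$

where $\begin{Bmatrix} x & y \\ s & t \end{Bmatrix}$ is obtained by multiplying s x's and t y's together in every possible
way and adding the terms so obtained.

In this notation the characteristic equation of λx + μy is

(25)    $0 = \sum_{r=0}^{n} (-1)^r \begin{bmatrix} \lambda x + \mu y \\ r \end{bmatrix} (\lambda x + \mu y)^{n-r} = \sum_{r,s,t} (-1)^r \begin{bmatrix} x & y \\ s & r-s \end{bmatrix} \begin{Bmatrix} x & y \\ t-s & n+s-r-t \end{Bmatrix} \lambda^t \mu^{n-t}$

where in the second summation $\begin{bmatrix} x & y \\ p & q \end{bmatrix}$ or $\begin{Bmatrix} x & y \\ p & q \end{Bmatrix}$ is to be replaced by 0 if either
p or q is negative and $\begin{bmatrix} x & y \\ 0 & 0 \end{bmatrix} = 1$. Since λ is an independent variable, the
coefficients of its various powers in (25) are identically 0, and therefore

(26)    $\sum_{r,s} (-1)^r \begin{bmatrix} x & y \\ s & r-s \end{bmatrix} \begin{Bmatrix} x & y \\ t-s & n+s-r-t \end{Bmatrix} = 0 \quad (t = 0, 1, \ldots, n)$

a series of identical relations connecting two arbitrary matrices.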

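These identities can be generalized immediately. If $x_1, x_2, \ldots, x_m$ are any
matrices and $\lambda_1, \lambda_2, \ldots$ scalar variables, we may write

(27)    $\begin{bmatrix} \sum \lambda_i x_i \\ r \end{bmatrix} = \sum \begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ r_1 & r_2 & \cdots & r_m \end{bmatrix} \lambda_1^{r_1} \lambda_2^{r_2} \cdots \lambda_m^{r_m}, \quad \Big( \sum \lambda_i x_i \Big)^r = \sum \begin{Bmatrix} x_1 & x_2 & \cdots & x_m \\ r_1 & r_2 & \cdots & r_m \end{Bmatrix} \lambda_1^{r_1} \lambda_2^{r_2} \cdots \lambda_m^{r_m} \quad \Big( \sum r_i = r \Big)$

and by the same reasoning as before we have

(28)    $\sum (-1)^r \begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ r_1 & r_2 & \cdots & r_m \end{bmatrix} \begin{Bmatrix} x_1 & x_2 & \cdots & x_m \\ s_1 - r_1 & s_2 - r_2 & \cdots & s_m - r_m \end{Bmatrix} = 0 \quad \Big( r = \sum r_i \Big)$

where $s_1, s_2, \ldots, s_m$ is any partition of n, zero parts included, and as before
a bracket symbol is 0 when any exponent is negative.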


Since $\begin{bmatrix} \sum \lambda_i x_i \\ r \end{bmatrix}$ is the sum of the principal minors of $\sum \lambda_i x_i$ of order r, we see
that $\begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ r_1 & r_2 & \cdots & r_m \end{bmatrix}$ ($\sum r_i = r$) is formed as follows. Take any principal minor
of $x_1$ of order r and the corresponding minors of $x_2, x_3, \ldots, x_m$ and replace $r_2$
of its columns by the corresponding columns of $x_2$, then replace $r_3$ of the remaining
columns by the corresponding ones of $x_3$, and so on; do this in every possible
way for each of the minors of order r of $x_1$ and add all the terms so obtained.

There is a great variety of relations connecting the scalar functions defined
above, a few of which we note here for convenience.


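(i)    $\begin{bmatrix} 1 \\ r \end{bmatrix} = \dfrac{n!}{r!(n-r)!}, \quad \begin{bmatrix} x & 1 \\ r & s \end{bmatrix} = \dfrac{(n-r)!}{s!(n-r-s)!} \begin{bmatrix} x \\ r \end{bmatrix}, \quad \begin{bmatrix} x & x \\ r & s \end{bmatrix} = \dfrac{(r+s)!}{r!\,s!} \begin{bmatrix} x \\ r+s \end{bmatrix}.$

(ii) The value of $\begin{bmatrix} x_1 x_2 \cdots x_m \\ r \end{bmatrix}$ is unchanged by a cyclic permutation of
the x's.

(iii)    $\begin{bmatrix} x_1 & x_2 & \cdots & x_m & 1 \\ r_1 & r_2 & \cdots & r_m & s \end{bmatrix} = \dfrac{(n - \sum r_i)!}{s!(n - s - \sum r_i)!} \begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ r_1 & r_2 & \cdots & r_m \end{bmatrix}.$

(iv)    $\begin{bmatrix} x & x & \cdots & x & y_1 & \cdots & y_p \\ r_1 & r_2 & \cdots & r_q & s_1 & \cdots & s_p \end{bmatrix} = \dfrac{(\sum r_i)!}{\prod (r_i!)} \begin{bmatrix} x & y_1 & \cdots & y_p \\ \sum r_i & s_1 & \cdots & s_p \end{bmatrix}.$

(v)    $\begin{bmatrix} x \\ r \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \\ 1 & 1 & \cdots & 1 \end{bmatrix} = \sum \begin{bmatrix} xy_{i_1} & xy_{i_2} & \cdots & xy_{i_r} & y_{i_{r+1}} & \cdots & y_{i_n} \\ 1 & 1 & \cdots & 1 & 1 & \cdots & 1 \end{bmatrix}$

where the summation extends over the n!/r!(n - r)! ways of choosing r integers
out of 1, 2, ..., n, the order being immaterial.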


7.07 Similar matrices. In addition to the identities discussed in the preceding
section Sylvester gave another type, a modification of which we shall
now discuss. If x, y, a are arbitrary matrices, we have

(29)    $x^{r+1}a - ay^{r+1} = x(x^r a + x^{r-1}ay + x^{r-2}ay^2 + \cdots + ay^r) - (x^r a + x^{r-1}ay + x^{r-2}ay^2 + \cdots + ay^r)y$

or say

$x^{r+1}a - ay^{r+1} = x(x, a, y)_r - (x, a, y)_r y$

where

(30)    $(x, a, y)_r = \sum_{i=0}^{r} x^{r-i} a y^i.$

Suppose now that x and y satisfy the same equation f(λ) = 0 where

$f(\lambda) = \lambda^m + a_1\lambda^{m-1} + \cdots + a_{m-1}\lambda + a_m,$

x and y being commutative with each $a_i$ and a commutative with every $a_i$. Let

(31)    $u = \sum_{i=0}^{m-1} a_i (x, a, y)_{m-i-1} \quad (a_0 = 1);$

then

(32)    $0 = f(x)a - af(y) = \sum a_i (x^{m-i}a - ay^{m-i}) = xu - uy.$

If |u| ≠ 0, it follows that $y = u^{-1}xu$, that is, x and y are similar.

It can be shown that a can be chosen so that |u| ≠ 0 provided x and y have
the same invariant factors and f(λ) is the reduced characteristic function.
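
Both (29) and the last equality in (32) are formal identities and can be tested on arbitrary integer matrices. The sketch below does so with scalar coefficients $a_i$. The matrices and the helper names (`bracket`, `poly`, `u_matrix` and so on) are illustrative assumptions. Since x and y are not required here to satisfy f(λ) = 0, only the identity f(x)a - af(y) = xu - uy is checked, not that either side vanishes.

```python
# Sketch: check (29) and f(x)a - a f(y) = xu - uy for arbitrary integer matrices,
# with scalar coefficients in f.

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power(a, k):
    result = identity(len(a))
    for _ in range(k):
        result = matmul(result, a)
    return result

def bracket(x, a, y, r):
    """(x, a, y)_r = sum over i = 0..r of x^(r-i) a y^i, as in (30)."""
    total = scale(0, a)
    for i in range(r + 1):
        total = add(total, matmul(matmul(power(x, r - i), a), power(y, i)))
    return total

def identity_29_holds(x, a, y, r):
    lhs = sub(matmul(power(x, r + 1), a), matmul(a, power(y, r + 1)))
    b = bracket(x, a, y, r)
    return lhs == sub(matmul(x, b), matmul(b, y))

def poly(coeffs, x):
    """f(x) for coeffs = [1, a_1, ..., a_m], i.e. x^m + a_1 x^(m-1) + ... + a_m."""
    m = len(coeffs) - 1
    total = scale(0, x)
    for i, c in enumerate(coeffs):
        total = add(total, scale(c, power(x, m - i)))
    return total

def u_matrix(coeffs, x, a, y):
    """u = sum over i = 0..m-1 of a_i (x, a, y)_(m-i-1), with a_0 = 1, as in (31)."""
    m = len(coeffs) - 1
    total = scale(0, a)
    for i in range(m):
        total = add(total, scale(coeffs[i], bracket(x, a, y, m - i - 1)))
    return total

def relation_32_holds(coeffs, x, a, y):
    lhs = sub(matmul(poly(coeffs, x), a), matmul(a, poly(coeffs, y)))
    u = u_matrix(coeffs, x, a, y)
    return lhs == sub(matmul(x, u), matmul(u, y))

x = [[1, 2], [3, 4]]
y = [[0, 1], [5, -2]]
a = [[2, -1], [1, 3]]
print(all(identity_29_holds(x, a, y, r) for r in range(5)),
      relation_32_holds([1, -3, 7, 2], x, a, y))
```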
8.01 Matric polynomials. The form of a polynomial in a matrix has already
been discussed in §2.11 but we repeat the principal formulas here for convenience.
If x is a matrix whose reduced characteristic function is

(1)    $\varphi(\lambda) = \prod_{i=1}^{r} (\lambda - \lambda_i)^{\nu_i}, \quad \sum \nu_i = \nu,$

and

$m_i(\lambda) = \varphi(\lambda)/(\lambda - \lambda_i)^{\nu_i}, \quad M_i(\lambda) m_i(\lambda) + (\lambda - \lambda_i)^{\nu_i} N_i(\lambda) = 1,$

(2)    $\varphi_i(\lambda) = M_i(\lambda) m_i(\lambda),$

(3)    $\varphi_i = \varphi_i(x), \quad h_i = (x - \lambda_i)\varphi_i,$

and if g(λ) is a scalar polynomial in λ, then

(4)    $g(x) = \sum_i \Big[ g(\lambda_i)\varphi_i + g'(\lambda_i) h_i + \cdots + \dfrac{g^{(\nu_i - 1)}(\lambda_i)}{(\nu_i - 1)!} h_i^{\nu_i - 1} \Big].$

This formula can still be interpreted when the coefficients of g(λ) are matrices,
but in this case the notation g(x) is ambiguous. Let $g(\lambda) = a_0 + a_1\lambda + \cdots + a_m\lambda^m$; then

$a_0 + a_1 x + \cdots + a_m x^m$ and $a_0 + x a_1 + \cdots + x^m a_m$

are called, respectively, the dextro- and laevo-lateral polynomials corresponding
to g(λ). It is clear that (4) holds for a dextro-lateral polynomial and will
give the corresponding laevo-lateral polynomial if $g(\lambda_i)\varphi_i$, $g'(\lambda_i)h_i$, etc., are
replaced by $\varphi_i g(\lambda_i)$, $h_i g'(\lambda_i)$, etc.


8.02 Infinite series. If $a_0, a_1, \ldots$ are matrices and λ a scalar variable, the
coordinates of the matrix

(5)    $g(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + \cdots$

are scalar infinite series in λ; and if each of these series converges for mod λ less
than ρ, we say that the series (5) converges. When this condition is satisfied,
the series

(6)    $g(x) = a_0 + a_1 x + a_2 x^2 + \cdots$

converges for any matrix x for which the absolute value of the root of greatest
absolute value is less than ρ. For if $g_m$ is the sum of the first m terms of (6),
then by (4) $g_m = \sum_i g_{mi}$ where

$g_{mi} = g_m(\lambda_i)\varphi_i + g_m'(\lambda_i) h_i + \cdots + \dfrac{g_m^{(\nu_i - 1)}(\lambda_i)}{(\nu_i - 1)!} h_i^{\nu_i - 1}.$

The matrices $\varphi_i$, $h_i$ are independent of m and, since the absolute value of each
$\lambda_i$ is less than ρ, $g_m(\lambda_i), g_m'(\lambda_i), \ldots, g_m^{(\nu_i - 1)}(\lambda_i)$ converge to $g(\lambda_i), \ldots, g^{(\nu_i - 1)}(\lambda_i)$
when m approaches infinity.

As an illustration of such a series we may define exp x and log (1 + x) by

(7)    $\exp x = e^x = 1 + x + \dfrac{x^2}{2!} + \cdots$

(8)    $\log(1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots$

The first of these converges for every matrix x, the second for matrices all of
whose roots are less than 1 in absolute value.

The usual rules for adding series and for multiplying series whose coefficients
are commutative with x and with each other hold for matric series. For
instance we can show by the ordinary proof that, if xy = yx, then $e^x e^y = e^{x+y}$;
but this will not usually be the case if xy ≠ yx.
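
The last remark can be tested exactly on nilpotent matrices, for which the exponential series terminates. The sketch below is an illustration under that assumption: all the matrices used are strictly upper triangular of order 3, so every power beyond the second vanishes. The function names and the sample matrices are invented for the example.

```python
# Sketch: e^x e^y = e^(x+y) when xy = yx, and a failure when xy != yx,
# using strictly upper triangular (hence nilpotent) 3 x 3 matrices so the series is finite.
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_nilpotent(x):
    """1 + x + x^2/2! + ... for a nilpotent x of order n; terms from x^n on vanish."""
    n = len(x)
    result = identity(n)
    term = identity(n)
    for k in range(1, n + 1):
        term = scale(Fraction(1, k), matmul(term, x))   # term is now x^k / k!
        result = add(result, term)
    return result

shift = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
x_c = shift                                   # commuting pair: x_c and a polynomial in it
y_c = add(shift, scale(2, matmul(shift, shift)))
x_n = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]       # non-commuting pair: e_12 and e_23
y_n = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

def exponent_law_holds(x, y):
    return matmul(exp_nilpotent(x), exp_nilpotent(y)) == exp_nilpotent(add(x, y))

print(exponent_law_holds(x_c, y_c), exponent_law_holds(x_n, y_n))
```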

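8.03 The canonical form of a function. In the case of multiform functions
(4) does not always give the most general determination of the function
which is only obtained by taking into account the partial as well as the principal
elements of the variable x. As in §3.06 suppose that x has the canonical
form

(9)    $x = \begin{Vmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_r \end{Vmatrix}$

where $a_i$ is a block of terms

(10)    $\begin{Vmatrix} \lambda_i & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{Vmatrix}$    ($r_i$ rows and columns).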




It is convenient to let $a_i$ stand also for the matrix derived from (9) by replacing
every $a_j$ by 0 except when j = i. We can then write

(11)    $x = \sum a_i, \quad a_i a_j = 0 \quad (i \neq j)$

and we may set

(12)    $a_i = \lambda_i e_i + z_i$

where (cf. §3.07) $z_i$ is a nilpotent matrix of index $r_i$, and

$e_i^2 = e_i, \quad e_i z_i = z_i = z_i e_i, \quad e_i e_j = 0, \quad e_i z_j = 0 = z_j e_i \quad (i \neq j), \quad \sum e_i = 1.$

The part of $z_i$ which is not 0 is given by the oblique line of 1's in (10); $z_i^2$ is
obtained by moving all the 1's one place to the right except the last which
disappears, and in general $z_i^{m-1}$ has a line of 1's starting in the mth column of
(10) and running parallel to the main diagonal till it meets the boundary of
the block.

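It is now easy to see the form of a scalar polynomial g(x) or of a convergent
power series with scalar coefficients; for

(13)    $g(x) = \sum g(a_i) e_i = \sum \Big[ g(\lambda_i) e_i + g'(\lambda_i) z_i + \cdots + \dfrac{g^{(r_i - 1)}(\lambda_i)}{(r_i - 1)!} z_i^{r_i - 1} \Big]$

and the block of terms in g(x) which corresponds to $a_i$ in (10) is, omitting the
subscripts for clearness,

$\begin{Vmatrix}
g(\lambda) & g'(\lambda) & \dfrac{g''(\lambda)}{2!} & \dfrac{g'''(\lambda)}{3!} & \cdots & \dfrac{g^{(r-1)}(\lambda)}{(r-1)!} \\
 & g(\lambda) & g'(\lambda) & \dfrac{g''(\lambda)}{2!} & \cdots & \dfrac{g^{(r-2)}(\lambda)}{(r-2)!} \\
 & & g(\lambda) & g'(\lambda) & \cdots & \dfrac{g^{(r-3)}(\lambda)}{(r-3)!} \\
 & & & \ddots & & \vdots \\
 & & & & g(\lambda) & g'(\lambda) \\
 & & & & & g(\lambda)
\end{Vmatrix}$

where all the terms to the left of the main diagonal are 0, the coordinates in
the first row are as indicated, and all those on a line parallel to the main diagonal
are the same as the one where this line meets the first row.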
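
The form of this block can be compared with a direct evaluation of a polynomial at a single canonical block. The sketch below is only an illustration: the polynomial, the root and the block order are arbitrary choices, and the function names are invented for the example.

```python
# Sketch: for one canonical block with root lam, g(block) has g^(k)(lam)/k! on the
# k-th oblique above the main diagonal. Polynomial and block are illustrative.
from math import comb

def jordan_block(lam, r):
    """Block of order r with lam on the diagonal and 1 on the oblique above it."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(r)]
            for i in range(r)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, x):
    """Evaluate sum_j coeffs[j] * x^j by Horner's rule."""
    n = len(x)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, x)
        for i in range(n):
            result[i][i] += c
    return result

def taylor_coefficient(coeffs, lam, k):
    """g^(k)(lam) / k! for g(t) = sum_j coeffs[j] * t^j."""
    return sum(comb(j, k) * c * lam ** (j - k) for j, c in enumerate(coeffs) if j >= k)

def expected_block(coeffs, lam, r):
    """Upper triangular array whose (p, q) entry is g^(q-p)(lam)/(q-p)! for q >= p."""
    return [[taylor_coefficient(coeffs, lam, q - p) if q >= p else 0 for q in range(r)]
            for p in range(r)]

coeffs = [5, 0, -1, 0, 1]     # g(t) = t^4 - t^2 + 5
lam, r = 2, 3
print(poly_of_matrix(coeffs, jordan_block(lam, r)) == expected_block(coeffs, lam, r))
```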

If the characteristic function is the same as the reduced function, no two
blocks of terms in (9) correspond to the same root and $e_i$, $z_i$ are the principal
idempotent and nilpotent elements of x corresponding to $\lambda_i$ and (13) is the
same as (4). This is not the case when the same root occurs in more than one
of the blocks (10) and, when this is so, the $a_i$ are not necessarily uniquely
determined. For instance let

(15)    $x = \begin{Vmatrix} \gamma & 0 & 0 \\ 0 & \gamma & 1 \\ 0 & 0 & \gamma \end{Vmatrix}.$

Here we have

$a_1 = \gamma e_{11}, \quad a_2 = \gamma(e_{22} + e_{33}) + e_{23},$

$e_1 = e_{11}, \quad e_2 = e_{22} + e_{33}, \quad z_1 = 0, \quad z_2 = e_{23}.$

But if

$f_{11} = e_{11} - e_{13}, \quad f_{12} = e_{12}, \quad f_{13} = e_{13},$

$f_{21} = e_{21} - e_{23}, \quad f_{22} = e_{22}, \quad f_{23} = e_{23},$

$f_{31} = e_{31} - e_{13} + e_{11} - e_{33}, \quad f_{32} = e_{32} + e_{12}, \quad f_{33} = e_{33} + e_{13},$

the f's form a set of matric units and

$x = \gamma f_{11} + \gamma(f_{22} + f_{33}) + f_{23}$

so that we might have chosen $f_1 = f_{11}$, $f_2 = f_{22} + f_{33} = e_{22} + e_{33} + e_{13}$ as
idempotent elements in place of $e_1$ and $e_2$.

It should be carefully noted that $f_1$, $f_2$ are not commutative with $e_1$, $e_2$ and in consequence different determinations of a multiform function may not be commutative with each other. For instance, if $x$ is the matrix given in (15) with $\gamma \neq 0$, and $\gamma^{1/2}$ is a particular determination of the square root of $\gamma$, we have already seen in §2.13 that determinations of $x^{1/2}$ are given by

\[ \begin{aligned}
u_1 &= \gamma^{1/2} e_{11} - \gamma^{1/2}(e_{22} + e_{33}) - e_{23}/2\gamma^{1/2}, \\
u_2 &= \gamma^{1/2} f_{11} - \gamma^{1/2}(f_{22} + f_{33}) - f_{23}/2\gamma^{1/2} \\
    &= \gamma^{1/2}(e_{11} - e_{13}) - \gamma^{1/2}(e_{22} + e_{33} + e_{13}) - e_{23}/2\gamma^{1/2} \\
    &= u_1 - 2\gamma^{1/2} e_{13},
\end{aligned} \]

and these two values of $x^{1/2}$ are not commutative.
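The two determinations can be verified in exact arithmetic. This is a small sketch assuming the illustrative value $\gamma = 4$ with $\gamma^{1/2} = 2$; the helper functions `e` and `combine` are hypothetical names introduced only for this check.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def e(i, j):
    """Matric unit e_ij of order 3 (indices run from 1)."""
    return [[1 if (r, c) == (i, j) else 0 for c in (1, 2, 3)] for r in (1, 2, 3)]

def combine(*terms):
    """Linear combination: the sum of coefficient * matrix over the given pairs."""
    return [[sum(c * m[r][k] for c, m in terms) for k in range(3)] for r in range(3)]

gamma = Fraction(4)     # assumed sample value of gamma
root = Fraction(2)      # the chosen determination of gamma^(1/2)

x = combine((gamma, e(1, 1)), (gamma, e(2, 2)), (gamma, e(3, 3)), (1, e(2, 3)))
u1 = combine((root, e(1, 1)), (-root, e(2, 2)), (-root, e(3, 3)),
             (-1 / (2 * root), e(2, 3)))
u2 = combine((1, u1), (-2 * root, e(1, 3)))   # u2 = u1 - 2 gamma^(1/2) e_13
```

Both `u1` and `u2` square to `x`, while `matmul(u1, u2)` and `matmul(u2, u1)` differ in the (1, 3) coordinate.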


8.04 Roots of 0 and 1. The reduced equation of a nilpotent matrix of index $m$ is $x^m = 0$ and this matrix can therefore be defined as a primitive $m$th root of 0; the index $m$ cannot be greater than $n$ and it exceeds 1 unless $x = 0$. The canonical form of $x$ must contain at least one block of order $r_1 = m$, similar to (10) but with $\lambda_i = 0$, and a number of like blocks of orders, say, $r_i$ ($i = 2, 3, \cdots$) where $r_i \le r_1$ and $\sum_i r_i = n$. This gives rise to a series of distinct types in number equal to the number of partitions of $n - m$ into parts no one of which exceeds $m$, and $x$ is a primitive $m$th root if, and only if, it is similar to one of these types.

If $x$ is a primitive $m$th root of 1, its reduced characteristic function is a factor of $\lambda^m - 1$ and hence $x$ has simple elementary divisors. Let $\epsilon$ be a scalar primitive $m$th root of 1, and let $f_1, f_2, \cdots, f_s$ be idempotent matrices of ranks $r_1, r_2, \cdots, r_s$ for which $f_i f_j = 0$ ($i \neq j$), $\sum f_i = 1$; for instance, if $p_i = \sum_{j=1}^{i-1} r_j$, we may set

\[ f_i = \sum_{j=p_i+1}^{p_{i+1}} e_{jj} \qquad (i = 1, 2, \cdots, s;\ p_{s+1} = n). \]

The canonical form for $x$ is then

\[ \epsilon^{t_1} f_1 + \epsilon^{t_2} f_2 + \cdots + \epsilon^{t_s} f_s \tag{16} \]

where the exponents $t_i$ are all different modulo $m$ and at least one $\epsilon^{t_i}$, say the first, is primitive. Any primitive $m$th root of 1 is then similar to a matrix of the form (16), and conversely.
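The count of types of primitive $m$th roots of 0 stated at the beginning of this section can be tabulated directly. The sketch below, with hypothetical function names, counts the partitions of $n - m$ into parts not exceeding $m$ and also lists the corresponding block orders $r_1 = m \ge r_2 \ge \cdots$; it assumes $1 \le m \le n$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions_bounded(total, largest):
    """Number of partitions of `total` into parts none of which exceeds `largest`."""
    if total == 0:
        return 1
    # Split according to the largest part actually used.
    return sum(partitions_bounded(total - part, part)
               for part in range(1, min(largest, total) + 1))

def nilpotent_types(n, m):
    """Lists of block orders (first block m, the rest non-increasing) with sum n."""
    def rest(total, largest):
        if total == 0:
            yield []
            return
        for part in range(min(largest, total), 0, -1):
            for tail in rest(total - part, part):
                yield [part] + tail
    return [[m] + tail for tail in rest(n - m, m)]

types_5_2 = nilpotent_types(5, 2)
types_6_3 = nilpotent_types(6, 3)
```

For $n = 5$, $m = 2$ there are two types, with block orders (2, 2, 1) and (2, 1, 1, 1); for $n = 6$, $m = 3$ there are three.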

8.05 The equation $y^m = x$; algebraic functions. Let $\lambda_1, \lambda_2, \cdots, \lambda_s$ be the distinct roots of $x$ and $\mu_i = \lambda_i^{1/m}$ a particular determination of the $m$th root of $\lambda_i$ for $i = 1, 2, \cdots, s$; then, if $y^m = x$, the roots of $y$ are all of the form $\epsilon^t \mu_i$ where $\epsilon$ is a primitive scalar $m$th root of 1. Suppose that the roots of $y$ are

\[ \begin{array}{llll}
\mu_{11} = \mu_1, & \mu_{12} = \epsilon^{t_{12}} \mu_1, & \cdots, & \mu_{1r_1} = \epsilon^{t_{1r_1}} \mu_1, \\
\cdots & & & \\
\mu_{s1} = \mu_s, & \mu_{s2} = \epsilon^{t_{s2}} \mu_s, & \cdots, & \mu_{sr_s} = \epsilon^{t_{sr_s}} \mu_s,
\end{array} \tag{17} \]

and let a particular choice of the partial idempotent and nilpotent elements corresponding to $\mu_{ij}$ be $f_{ijk}$ and $h_{ijk}$ ($k = 1, 2, \cdots$); also let the index of $h_{ijk}$ be $n_{ijk}$. Then

\[ y = \sum (\mu_{ij} f_{ijk} + h_{ijk}) \tag{18} \]

and hence

\[ x = y^m = \sum (\mu_{ij} f_{ijk} + h_{ijk})^m = \sum (\lambda_i f_{ijk} + g_{ijk}) \tag{19} \]

where $g_{ijk}$ is the nilpotent matrix

\[ g_{ijk} = (\mu_{ij} f_{ijk} + h_{ijk})^m - \lambda_i f_{ijk}. \tag{20} \]

Further, if $\lambda_i \neq 0$, (20) can be solved for $h_{ijk}$ as a polynomial in $g_{ijk}$; for we can write (20) in the form

\[ g_{ijk} = \alpha_1 h_{ijk} + \alpha_2 h_{ijk}^2 + \cdots \tag{21} \]

and, since $\alpha_1 = m \mu_{ij}^{m-1} \neq 0$, the ordinary process for inverting a power series shows that we can satisfy (21) by a series of the form

\[ h_{ijk} = \beta_1 g_{ijk} + \beta_2 g_{ijk}^2 + \cdots \qquad (\beta_1 \neq 0), \tag{22} \]

there being here no question of convergence since any power series in a nilpotent matrix terminates. It follows from (21) and (22) that the indices of $g_{ijk}$ and $h_{ijk}$ are the same.

We shall now show that the matrices $f_{ijk}$ and $g_{ijk}$ form a set of partial idempotent and nilpotent elements of $x$ provided always that $x$ is not singular. If this were not so, then $f_{ijk}$ must be the sum of two or more partial idempotent elements; for the sake of brevity we shall assume that it is the sum of two since the proof proceeds in exactly the same way if more components are taken. Let $f_{ijk} = d_1 + d_2$ where $d_1$ and $d_2$ are partial idempotent elements of $x$ and let $c_1$, $c_2$ be the corresponding nilpotent elements; then

\[ g_{ijk} = c_1 + c_2, \qquad c_1 c_2 = 0 = c_2 c_1. \]

Hence also $h_{ijk} = b_1 + b_2$, $b_1 b_2 = 0 = b_2 b_1$ where $b_\alpha$ is obtained by putting $c_\alpha$ for $g_{ijk}$ in (22); and this is impossible since we assumed that $f_{ijk}$ and $h_{ijk}$ were partial idempotent and nilpotent elements of $y$. We have therefore the following theorem.

Theorem 1. If $x$ is a non-singular matrix, any determination of $y = x^{1/m}$ can be obtained by expressing $x$ in terms of partial idempotent and nilpotent elements, say $x = \sum (\lambda_i f_i + g_i)$ and putting

\[ y = \sum (\lambda_i f_i + g_i)^{1/m} = \sum \lambda_i^{1/m} \left( f_i + \frac{1}{m} \frac{g_i}{\lambda_i} + \binom{1/m}{2} \frac{g_i^2}{\lambda_i^2} + \cdots \right). \]

Here the binomial series terminates and $\lambda_i^{1/m}$ is a determination of the $m$th root of $\lambda_i$ which may be different for different terms of the summation if this root occurs with more than one partial element.
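The terminating binomial series of Theorem 1 is easy to carry out for a single partial element. The sketch below is an illustration under stated assumptions: one block with $f = 1$, $g$ the shift matrix of order 3, $\lambda = 8$, $m = 3$ and the determination $\lambda^{1/3} = 2$; the function names are hypothetical. Exact rational arithmetic is used so that $y^3 = x$ can be tested without rounding.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def binomial(alpha, k):
    """Generalized binomial coefficient alpha (alpha - 1) ... (alpha - k + 1) / k!."""
    result = Fraction(1)
    for j in range(k):
        result = result * (alpha - j) / (j + 1)
    return result

def root_of_block(lam_root, m, size):
    """An m-th root of lam*f + g, where lam = lam_root**m, f = 1 and g is the shift."""
    lam = lam_root ** m
    g = [[Fraction(int(j == i + 1)) for j in range(size)] for i in range(size)]
    power = [[Fraction(int(j == i)) for j in range(size)] for i in range(size)]  # g^0 = f
    y = [[Fraction(0)] * size for _ in range(size)]
    for k in range(size):                       # g^size = 0, so the series stops here
        coeff = lam_root * binomial(Fraction(1, m), k) / lam ** k
        for i in range(size):
            for j in range(size):
                y[i][j] += coeff * power[i][j]
        power = matmul(power, g)
    return y

y = root_of_block(Fraction(2), 3, 3)            # a cube root of the block with lam = 8
cube = matmul(matmul(y, y), y)
```

Replacing `Fraction(2)` by another cube root of 8 would give a different determination; this sketch only covers the rational one.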

There is thus a two-sided multiplicity of $m$th roots of $x$; the $\lambda_i^{1/m}$ have $m$ possible determinations in each term and also there is in certain cases an infinity of ways of choosing the set of partial elements. Since the canonical form is independent of the actual choice of the set of partial elements out of the possible sets, any choice of such a set can be derived from any other such set by transforming it by a matrix $u$; and since $x$ itself is the same no matter what set of partial elements is chosen, we have $u x u^{-1} = x$, that is, $u$ is commutative with $x$. It follows from the development given in §§7.01, 7.02 and 7.04 that a matrix $u$ which is commutative with every partial idempotent element is a polynomial in $x$.

8.06 We must now consider the case in which $x$ is singular and in doing so it is sufficient to discuss $m$th roots of a nilpotent matrix; for the principal idempotent element of $x$ which corresponds to a root $\lambda$ is the sum of those principal idempotent elements of $y$ which correspond to those roots whose $m$th power is $\lambda$, so that the principal idempotent element corresponding to the root 0 is the same for both $x$ and $y$. Let the elementary divisors of $y$ be $\lambda^{n_1}, \lambda^{n_2}, \cdots, \lambda^{n_p}$; then

\[ y = y_1 + y_2 + \cdots + y_p \]

where $y_i$ is a nilpotent matrix of index $n_i$, and we may suppose the fundamental basis so chosen that the significant part of $y_i$ is

\[ \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ & & & & 0 \end{pmatrix} \qquad (n_i \text{ rows and columns}). \tag{23} \]

To simplify the notation we shall consider for the moment only one part $y_i$ and replace it by $y$ and $n_i$ by $n$ so that $y^n = 0$ and

\[ y = \sum_{i=1}^{n-1} e_i S e_{i+1}. \]

If we now form the $m$th power of $y$, then $y^m = 0$ if $m \ge n$ and if $m < n$

\[ y^m = \sum_{i=1}^{n-m} e_i S e_{i+m}. \]

If we define $r$ and $k$ by

\[ (r - 1)m + k = n \le rm \qquad (k > 0) \tag{24} \]

then $r \ge 2$ and

\[ y^m e_i = 0, \quad y^m e_{i+m} = e_i, \quad y^m e_{i+2m} = e_{i+m}, \quad \cdots, \quad y^m e_{i+(r-1)m} = e_{i+(r-2)m} \qquad (i = 1, 2, \cdots, k) \tag{25} \]

giving $k$ chains of order $r$ of invariant vectors, and similarly for $i = k + 1, k + 2, \cdots, m$, we have $m - k$ chains whose order is $r - 1$ since for these values of $i$ the last equation in (25) is missing. If we set $u$ and $v$ for blocks of terms like (23) only with $r$ and $r - 1$ rows and columns, respectively, then we can find a non-singular matrix $P$ which permutes the rows and columns in $y^m$ so that

\[ P^{-1} y^m P = \begin{pmatrix} u & & & & & \\ & \ddots & & & & \\ & & u & & & \\ & & & v & & \\ & & & & \ddots & \\ & & & & & v \end{pmatrix} \qquad (k\ u\text{'s and } m - k\ v\text{'s}). \tag{26} \]
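The chain structure described by (24), (25) and (26) can be checked numerically for a single block. The following sketch uses hypothetical helper names and relies on one simplifying assumption, stated in the code: every power of $y^m$ has at most one 1 in each row and column, so its rank is just the number of non-zero rows.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift_power(n, m):
    """y^m for the block (23) of order n: ones in the positions (i, i + m)."""
    return [[1 if j == i + m else 0 for j in range(n)] for i in range(n)]

def rank_of_partial_permutation(z):
    """Rank of a 0/1 matrix with at most one 1 per row and column (assumed here)."""
    return sum(1 for row in z if any(row))

def block_orders(z):
    """Orders of the nilpotent blocks of z, read off from the ranks of its powers."""
    ranks, power = [len(z)], z
    while ranks[-1] > 0:
        ranks.append(rank_of_partial_permutation(power))
        power = matmul(power, z)
    # ranks[j-1] - ranks[j] is the number of blocks of order at least j.
    at_least = [ranks[j - 1] - ranks[j] for j in range(1, len(ranks))]
    orders = []
    for j, count in enumerate(at_least, start=1):
        following = at_least[j] if j < len(at_least) else 0
        orders += [j] * (count - following)
    return sorted(orders, reverse=True)

def predicted_orders(n, m):
    """k blocks of order r and m - k of order r - 1, with r and k as in (24)."""
    r = -(-n // m)                  # least integer r with n <= r m
    k = n - (r - 1) * m
    return [order for order in [r] * k + [r - 1] * (m - k) if order > 0]

found = block_orders(shift_power(7, 3))
```

For $n = 7$, $m = 3$ equation (24) gives $r = 3$, $k = 1$, so $y^3$ consists of one block of order 3 and two of order 2.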


We are now in a position to consider the solution of $y^m = x$ where $x$ is a nilpotent matrix of index $r$. In the elementary divisors of $x$ suppose $p_1$ exponents equal $r$, $p_2$ equal $r - 1$, and in general $p_j$ equal $r - j + 1$ ($j = 1, 2, \cdots, r$); here the $p$'s are integers equal to or greater than 0 such that

\[ \sum_j (r - j + 1) p_j = n, \qquad p_1 \neq 0. \]

The maximum possible exponent for any elementary divisor of $y$ is $rm$; let $q_{ji}$ ($i = 1, 2, \cdots, m$; $j = 1, 2, \cdots, r$) be the number of exponents which equal

\[ (r - j + 1)m - i + 1 = (r - j)m + (m - i + 1). \]

Forming $y^m$ and using (24) and the results of (25) we then see that

\[ \begin{aligned}
& q_{j-1,2} + 2 q_{j-1,3} + \cdots + (m - 1) q_{j-1,m} + m q_{j1} + (m - 1) q_{j2} + \cdots + q_{jm} = p_j \qquad (j = 1, 2, \cdots, r;\ q_{0i} = 0), \\
& \sum_{i,j} [(r - j + 1)m - i + 1]\, q_{ji} = n.
\end{aligned} \tag{27} \]

These relations form a set of Diophantine equations for the $q$'s. When a set of $q$'s have been found, we can find the matrix $P$ (cf. (26)) for each part of $y^m$ and then set $y = \sum R^{-1} y_i R$ where $R$ has the form $\sum P_i Q_i$, $Q_i$ being commutative with $P_i^{-1} y_i^m P_i$ and so chosen that $R$ is not singular.


8.07 The exponential and logarithmic functions. The function $\exp y = e^y$ has already been defined in §8.02 by the series

\[ e^y = 1 + \sum_{m=1}^{\infty} y^m/m! \]

or in §8.03 in terms of the partial units of $y$. Let the distinct roots of $y$ be $\mu_1, \mu_2, \cdots, \mu_s$ and let a choice of the partial idempotent and nilpotent elements corresponding to $\mu_i$ be $f_{ij}$, $h_{ij}$ ($j = 1, 2, \cdots, k_i$) so that

\[ f_i = \sum_{j=1}^{k_i} f_{ij}, \qquad h_i = \sum_{j=1}^{k_i} h_{ij} \qquad (i = 1, 2, \cdots, s) \tag{28} \]

are the principal idempotent and nilpotent elements of $y$. If we set $x = e^y$, we have

\[ x = e^y = \sum_i \sum_j e^{\mu_i} (f_{ij} + g_{ij}) \tag{29} \]

where $\nu_{ij}$ is the index of $h_{ij}$ and

\[ g_{ij} = h_{ij} + \frac{h_{ij}^2}{2!} + \cdots + \frac{h_{ij}^{\nu_{ij}-1}}{(\nu_{ij} - 1)!}. \tag{30} \]

The index of $g_{ij}$ is clearly $\nu_{ij}$. Solving (30) for $h_{ij}$, we have the formal solution $h_{ij} = \log (1 + g_{ij})$ and on using this or inverting the power series in (30) we get

\[ h_{ij} = g_{ij} - \tfrac{1}{2} g_{ij}^2 + \cdots + (-1)^{\nu_{ij}} \frac{g_{ij}^{\nu_{ij}-1}}{\nu_{ij} - 1}. \tag{31} \]

As in §8.05 it follows that the $f_{ij}$, $g_{ij}$ form a set of partial elements for $x$ and, when $x$ is given so that $y = \log x$, the method there used gives the following theorem.

Theorem 2. If $x$ is a non-singular matrix whose distinct roots are $\lambda_1, \lambda_2, \cdots, \lambda_r$, and if $\log \lambda_1, \log \lambda_2, \cdots, \log \lambda_r$ are particular determinations of the logarithms of these roots, then the general determination of $\log x$ is found as follows. Take any set of partial elements of $x$, say $f_{ij}$, $g_{ij}$ ($i = 1, 2, \cdots, r$; $j = 1, 2, \cdots, k_i$) where $f_{ij}$, $g_{ij}$ correspond to $\lambda_i$, normalized as in (29) so that $x = \sum_i \sum_j \lambda_i (f_{ij} + g_{ij})$, and the index of $g_{ij}$ is $\nu_{ij}$; let $h_{ij}$ be the nilpotent matrix defined by (31), and let $k_{ij}$ be any integers; then

\[ \log x = \sum_i \sum_j \left[ (\log \lambda_i + 2 \pi k_{ij} \sqrt{-1}) f_{ij} + h_{ij} \right]. \tag{32} \]

The discussion of the relation between different determinations of $\log x$ is practically the same as for $x^{1/m}$ and need not be repeated.

If $f_i$ and $h_i$ are defined by (28), a particular determination of $\log x$ is given by

\[ \log x = \sum_i \left[ (\log \lambda_i + 2 \pi k_i \sqrt{-1}) f_i + h_i \right]. \tag{33} \]

This form of $\log x$ has the same principal elements as $x$ provided $\log \lambda_i + 2 \pi k_i \sqrt{-1} \neq \log \lambda_j + 2 \pi k_j \sqrt{-1}$ for any $i \neq j$, and even when this condition is not satisfied, it is convenient to refer to (33) as a principal determination of $\log x$. This determination is the one given by the series (cf. §8.02 (8))

\[ \log x = (x - 1) - \tfrac{1}{2}(x - 1)^2 + \tfrac{1}{3}(x - 1)^3 - \cdots \tag{34} \]

provided each $k_i$ is 0 and the principal determination of $\log \lambda_i$ is used. The series converges only when the roots of $x - 1$ are all less than 1 in absolute value.
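As an illustration of (31) and (32), the following sketch computes the principal logarithm of a single block $x = \lambda(1 + g)$ with real $\lambda > 0$ and checks it by exponentiating again. The block, the value $\lambda = 2$ and the function names are assumptions made for the example; the exponential is evaluated by a truncated power series, which is adequate here because the matrix is small.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def log_of_block(lam, size):
    """log(lam) * 1 + h, where h = g - g^2/2 + ... as in (31) and g = shift / lam."""
    g = [[1.0 / lam if j == i + 1 else 0.0 for j in range(size)] for i in range(size)]
    result = [[math.log(lam) if j == i else 0.0 for j in range(size)]
              for i in range(size)]
    power = g
    for k in range(1, size):                    # g^size = 0, so the series terminates
        sign = 1.0 if k % 2 == 1 else -1.0
        for i in range(size):
            for j in range(size):
                result[i][j] += sign * power[i][j] / k
        power = matmul(power, g)
    return result

def exp_series(y, terms=40):
    """Truncated exponential series 1 + y + y^2/2! + ... (an approximation)."""
    n = len(y)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, y)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

x = [[2.0, 1.0, 0.0], [0.0, 2.0, 1.0], [0.0, 0.0, 2.0]]
log_x = log_of_block(2.0, 3)
recovered = exp_series(log_x)
```

Adding $2 \pi k \sqrt{-1}$ to the diagonal of `log_x` would give the other determinations in (32) for this block; the sketch stays with $k = 0$.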

8.08 The canonical form of a matrix in a given field. If the coefficients of a matrix are restricted to lie in a given field of rationality, the canonical form used in the preceding sections requires some modification. The definition of the invariant factors is rational as are also the theorems regarding similar matrices which were derived from them in Chapter 3; and hence if $X$ and $x$ are rational matrices which have the same invariant factors there exists a rational matrix $P$ for which $P^{-1} x P = X$. The definition of elementary divisors requires only the natural alteration of substituting powers of irreducible polynomials for $(\lambda - \lambda_i)^{\nu_{ij}}$.

Let

\[ \alpha(\lambda) = \lambda^m + a_1 \lambda^{m-1} + \cdots + a_m \]

be a scalar polynomial in a field $F$ which is irreducible in $F$; then the matrix of order $m$ defined by

\[ x_\alpha = \begin{pmatrix}
-a_1 & -a_2 & \cdots & -a_{m-1} & -a_m \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix} \tag{35} \]

has $\alpha(\lambda)$ as its characteristic function; since $\alpha(\lambda)$ is irreducible in $F$, it follows immediately that $x_\alpha$ is an irreducible matrix in $F$ and that $\alpha(\lambda)$ is also the reduced characteristic function. It is easily seen that the invariant factors of $\lambda - x_\alpha$ are given by $m - 1$ units followed by $\alpha(\lambda)$.
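A short check of (35): the sketch below builds the matrix $x_\alpha$ from the coefficients $a_1, \cdots, a_m$ and verifies that $\alpha(x_\alpha) = 0$. The sample polynomials and the function names are illustrative assumptions; the code does not test irreducibility.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def companion(a):
    """Matrix (35) for alpha(l) = l^m + a[0] l^(m-1) + ... + a[m-1]."""
    m = len(a)
    rows = [[-c for c in a]]                     # first row: -a_1, ..., -a_m
    for i in range(1, m):
        rows.append([1 if j == i - 1 else 0 for j in range(m)])
    return rows

def alpha_of(a, x):
    """alpha(x) by Horner's rule, starting from the leading coefficient 1."""
    m = len(x)
    result = [[int(i == j) for j in range(m)] for i in range(m)]
    for c in a:
        result = matmul(result, x)
        for i in range(m):
            result[i][i] += c
    return result

zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
x_alpha = companion([0, 0, -2])                  # alpha(l) = l^3 - 2
value = alpha_of([0, 0, -2], x_alpha)
```

Here `x_alpha` has rows (0, 0, 2), (1, 0, 0), (0, 1, 0), and its cube is twice the identity.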

Again, if we consider

\[ x_a = \begin{pmatrix}
x_\alpha & 1_m & & \\
 & x_\alpha & 1_m & \\
 & & \ddots & \ddots \\
 & & & x_\alpha
\end{pmatrix} \qquad (r \text{ rows and columns}) \tag{36} \]

which is a matrix of order $rm$, we see as in §8.03 (14) that, if $g(\lambda)$ is a scalar polynomial in $\lambda$,

\[ g(x_a) = \begin{pmatrix}
g(x_\alpha) & g'(x_\alpha) & \cdots & g^{(r-1)}(x_\alpha)/(r-1)! \\
 & g(x_\alpha) & \cdots & g^{(r-2)}(x_\alpha)/(r-2)! \\
 & & \ddots & \vdots \\
 & & g(x_\alpha) & g'(x_\alpha) \\
 & & & g(x_\alpha)
\end{pmatrix}. \]

It follows that, if $g(x_a) = 0$, we must have $g^{(r-1)}(x_\alpha) = 0$ and therefore $\alpha(\lambda)$ is a factor of $g^{(r-1)}(\lambda)$ so that $[\alpha(\lambda)]^r$ is a factor of $g(\lambda)$. But if we put $g(\lambda) = [\alpha(\lambda)]^r$ the first $(r - 1)$ derivatives of $g(\lambda)$ have $\alpha(\lambda)$ as a factor and so vanish when $\lambda$ is replaced by $x_\alpha$; hence $g(x_a) = 0$. It follows that the reduced characteristic function of $x_a$ is $[\alpha(\lambda)]^r$ and, since the degree of this polynomial equals the order $rm$ of $x_a$, it is also the characteristic function so that the invariant factors of $x_a$ are given by 1 repeated $rm - 1$ times followed by $[\alpha(\lambda)]^r$. The argument used in §3.06 then gives the following theorem.


Theorem 3. Let $\alpha_1(\lambda), \alpha_2(\lambda), \cdots, \alpha_k(\lambda)$ be polynomials, not necessarily distinct, which are rational and irreducible in a field $F$ and whose degrees are $m_1, m_2, \cdots, m_k$ respectively; and let $r_1, r_2, \cdots, r_k$ be any positive integers such that $\sum_i r_i m_i = n$; then, if $x_{a_i}$ is the matrix of order $r_i m_i$ formed from $\alpha_i(\lambda)$ in the same way as $x_a$ in (36) is formed from $\alpha(\lambda)$, the matrix of order $n$ defined by

\[ x = \begin{pmatrix} x_{a_1} & & & \\ & x_{a_2} & & \\ & & \ddots & \\ & & & x_{a_k} \end{pmatrix} \tag{37} \]

has $\alpha_1^{r_1}(\lambda), \alpha_2^{r_2}(\lambda), \cdots, \alpha_k^{r_k}(\lambda)$ as its elementary divisors in $F$.

If $X$ is any matrix with the same elementary divisors as $x$, it follows from §3.04 theorem 5 that we can find a rational nonsingular matrix $P$ such that $P X P^{-1} = x$. We may therefore take (37) as a canonical form for a matrix in the given field $F$.

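8.09 The absolute value of a matrix. The absolute value of a matrix $a = \| a_{pq} \|$ is most conveniently defined as

\[ \pmb{|} a \pmb{|} = \Bigl( \sum_{p,q=1}^{n} a_{pq} \bar{a}_{pq} \Bigr)^{1/2} \tag{38} \]

where the heavy bars are used to distinguish between the absolute value and the determinant $|a|$. It must be carefully noted that the absolute value of a scalar matrix $\lambda$ is not the same as the ordinary absolute value or modulus of $\lambda$, the relation between them being

\[ \pmb{|} \lambda \pmb{|} = n^{1/2} \operatorname{mod} \lambda. \tag{39} \]

It follows immediately from (38) that

\[ \operatorname{mod} (\pmb{|} a \pmb{|} - \pmb{|} b \pmb{|}) \le \pmb{|} a + b \pmb{|} \le \pmb{|} a \pmb{|} + \pmb{|} b \pmb{|}; \tag{40} \]

and from

\[ \sum_{p,r} a_{pr} \bar{a}_{pr} \sum_{s,q} b_{sq} \bar{b}_{sq} = \sum_{p,q} \Bigl[ \tfrac{1}{2} \sum_{r,s} (a_{pr} \bar{b}_{sq} - a_{ps} \bar{b}_{rq})(\bar{a}_{pr} b_{sq} - \bar{a}_{ps} b_{rq}) + \sum_r a_{pr} b_{rq} \sum_s \bar{a}_{ps} \bar{b}_{sq} \Bigr] \]

we have

\[ \pmb{|} ab \pmb{|} \le \pmb{|} a \pmb{|}\, \pmb{|} b \pmb{|}. \tag{41} \]

Since the trace of $a \bar{a}'$ is $\sum a_{pq} \bar{a}_{pq}$, the absolute value of $a$ might also have been defined by

\[ \pmb{|} a \pmb{|}^2 = \operatorname{tr} a \bar{a}' = \operatorname{tr} \bar{a}' a. \tag{42} \]

From this we see immediately that, if $a$ is unitary, that is, $a \bar{a}' = 1$, then

\[ \pmb{|} a \pmb{|} = n^{1/2}, \qquad \pmb{|} ab \pmb{|} = \pmb{|} b \pmb{|} \tag{43} \]

where $b$ is any matrix.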
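The absolute value (38) is what is now usually called the Frobenius norm, and the relations (39) to (43) can be spot-checked on small matrices. The sketch below uses arbitrary sample matrices and a real orthogonal matrix as the unitary example; these choices and the function names are assumptions for illustration, and a numerical check of this kind is of course not a proof.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Sum of two matrices of the same shape."""
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def absolute_value(a):
    """Definition (38): square root of the sum of |a_pq|^2 (works for complex entries)."""
    return math.sqrt(sum(abs(entry) ** 2 for row in a for entry in row))

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[0.0, 1.0], [-1.0, 2.0]]
u = [[0.6, -0.8], [0.8, 0.6]]           # real orthogonal, hence unitary
one = [[1.0, 0.0], [0.0, 1.0]]

norm_a, norm_b = absolute_value(a), absolute_value(b)
norm_sum = absolute_value(add(a, b))
norm_product = absolute_value(matmul(a, b))
norm_ub = absolute_value(matmul(u, b))
```

With these samples `norm_product` is $\sqrt{166}$ while `norm_a * norm_b` is $\sqrt{180}$, in agreement with (41).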

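No matter what matrix $a$ is, $a \bar{a}'$ is a positive hermitian matrix, semi-definite or definite according as $a$ is or is not singular; the roots $g_1, g_2, \cdots, g_n$ of $a \bar{a}'$ are therefore real and not negative. If we set

\[ p_r = \sum g_1 g_2 \cdots g_r, \qquad p = p_n = (\operatorname{mod} |a|)^2, \qquad s = p_1 = \pmb{|} a \pmb{|}^2 = \sum g_i, \]

then

\[ r^r p_r \le \sum (g_1 + g_2 + \cdots + g_r)^r \le \binom{n-1}{r-1} s^r \]

whence¹

\[ \binom{n}{r} p^{r/n} \le p_r \le r^{-r} \binom{n-1}{r-1} s^r. \tag{44} \]

If $C^r(a)$ is the $r$th supplementary compound of $a$ (cf. §5.03), and $\alpha = p^{1/2}$ is put for $\operatorname{mod} |a|$, then $p_r = \pmb{|} C^r(a) \pmb{|}^2$ and we may write for (44)

\[ \binom{n}{r} \alpha^{2r/n} \le \pmb{|} C^r(a) \pmb{|}^2 \le r^{-r} \binom{n-1}{r-1} \pmb{|} a \pmb{|}^{2r} \]

and, since $(a')^{-1} = C^{n-1}(a)/|a|$, we have

\[ n^{1/2} \alpha^{-1/n} \le \pmb{|} a^{-1} \pmb{|} \le (n - 1)^{-\frac{n-2}{2}}\, \pmb{|} a \pmb{|}^{n-1} / \alpha \tag{45} \]

provided $|a| \neq 0$. This inequality enables us to deal with expressions involving negative powers of $a$.

Since $a^{-1} = \exp (-\log a)$, we also have

\[ \pmb{|} a^{-1} b \pmb{|} = \pmb{|} (1 - \log a + \tfrac{1}{2} (\log a)^2 - \cdots) b \pmb{|} \le (1 + \pmb{|} \log a \pmb{|} + \tfrac{1}{2} \pmb{|} \log a \pmb{|}^2 + \cdots)\, \pmb{|} b \pmb{|} \]

and therefore

\[ \pmb{|} a^{-1} b \pmb{|} \le e^{\pmb{|} \log a \pmb{|}}\, \pmb{|} b \pmb{|}. \tag{46} \]

Putting $b = 1$ we also have

\[ \pmb{|} a^{-1} \pmb{|} \le n^{1/2} - 1 + e^{\pmb{|} \log a \pmb{|}} \quad \text{and} \quad \pmb{|} a^{-1} \pmb{|} \le n^{1/2} e^{\pmb{|} \log a \pmb{|}}. \tag{47} \]

It is also sometimes convenient to note as a consequence of (41) with $b = a^{-1}$ that

\[ \pmb{|} a^{-1} \pmb{|} \ge n^{1/2}\, \pmb{|} a \pmb{|}^{-1}. \tag{48} \]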

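¹ When $r = n$, (44) gives Hadamard's expression for the maximum value of $\operatorname{mod} |a|$.

8.10 Infinite products. As an illustration of the use of the preceding section we shall now investigate briefly the convergence of an infinite product. Let

\[ \begin{aligned}
P_m &= (1 + a_1)(1 + a_2) \cdots (1 + a_m), \\
Q_m &= (1 + \pmb{|} a_1 \pmb{|})(1 + \pmb{|} a_2 \pmb{|}) \cdots (1 + \pmb{|} a_m \pmb{|}),
\end{aligned} \tag{49} \]

then, if $c$ is an arbitrary matrix,

\[ \pmb{|} P_m - 1 \pmb{|} \le Q_m - 1 \le e^{\sum \pmb{|} a_i \pmb{|}} - 1, \tag{50} \]

\[ \pmb{|} P_m c \pmb{|} \le Q_m \pmb{|} c \pmb{|} \le e^{\sum \pmb{|} a_i \pmb{|}}\, \pmb{|} c \pmb{|}, \tag{51} \]

\[ \pmb{|} P_m - P_k \pmb{|} \le Q_m - Q_k \le e^{\sum_{1}^{k} \pmb{|} a_i \pmb{|}} \bigl( e^{\sum_{k+1}^{m} \pmb{|} a_i \pmb{|}} - 1 \bigr). \tag{52} \]

For on expanding $P_m$ we have

\[ P_m = 1 + \sum_p a_p + \sum_{p<q} a_p a_q + \sum_{p<q<r} a_p a_q a_r + \cdots \]

therefore

\[ \pmb{|} P_m - 1 \pmb{|} \le \sum_p \pmb{|} a_p \pmb{|} + \sum_{p<q} \pmb{|} a_p \pmb{|}\, \pmb{|} a_q \pmb{|} + \cdots = Q_m - 1, \]

\[ \pmb{|} P_m c \pmb{|} \le \Bigl( 1 + \sum_p \pmb{|} a_p \pmb{|} + \sum_{p<q} \pmb{|} a_p \pmb{|}\, \pmb{|} a_q \pmb{|} + \cdots \Bigr) \pmb{|} c \pmb{|} = Q_m \pmb{|} c \pmb{|} \le e^{\sum \pmb{|} a_i \pmb{|}}\, \pmb{|} c \pmb{|}. \]

The proof of (52) follows in the same manner.

Hence $P_m$ converges when $Q_m$ does, for which it is sufficient that $\sum a_i$ is absolutely convergent in the sense that $\sum \pmb{|} a_i \pmb{|}$ converges.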
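The bound (50) can be observed on a concrete product. In the sketch below the factors are taken, purely as an assumption for illustration, to be $a_i = b/i^2$ for a fixed $2 \times 2$ matrix $b$, so that $\sum \pmb{|} a_i \pmb{|}$ converges; the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def absolute_value(a):
    """Definition (38) of section 8.09."""
    return math.sqrt(sum(abs(entry) ** 2 for row in a for entry in row))

def products(factors):
    """P_m and Q_m of (49) for a list of square matrices a_1, ..., a_m."""
    n = len(factors[0])
    p = [[float(i == j) for j in range(n)] for i in range(n)]
    q = 1.0
    for a in factors:
        one_plus_a = [[a[i][j] + (i == j) for j in range(n)] for i in range(n)]
        p = matmul(p, one_plus_a)
        q *= 1.0 + absolute_value(a)
    return p, q

base = [[0.0, 1.0], [-1.0, 0.5]]
factors = [[[entry / i ** 2 for entry in row] for row in base] for i in range(1, 51)]
p, q = products(factors)
p_minus_one = [[p[i][j] - (i == j) for j in range(2)] for i in range(2)]
bound = math.exp(sum(absolute_value(a) for a in factors))
```

The three quantities `absolute_value(p_minus_one)`, `q - 1` and `bound - 1` then appear in increasing order, as (50) requires.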

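8.11 The absolute value of a tensor. If $w$ is a tensor of grade $r$, we define the absolute value of $w$ by²

\[ \operatorname{mod} w = (r!\, S \bar{w} w)^{1/2}. \tag{53} \]

We shall for the most part consider only vectors of grade 1 as the extension to tensors of higher grade is usually immediate.

If $x$ and $y$ are any vectors, we derive from (53)

\[ \operatorname{mod} (x + y) \le \operatorname{mod} x + \operatorname{mod} y, \qquad \operatorname{mod} Sxy \le \operatorname{mod} x \operatorname{mod} y. \tag{54} \]

If $A$ is a matrix,

\[ (\operatorname{mod} Ax)^2 = S \bar{A} \bar{x} A x = S \bar{x} \bar{A}' A x. \]

By §6.02

\[ \bar{A}' A = \sum g_i a_i S \bar{a}_i \]

where the $g$'s are real and not negative, and $S \bar{a}_i a_j = \delta_{ij}$ so that $\operatorname{mod} a_i = 1$; hence

\[ \begin{aligned}
S \bar{x} \bar{A}' A x &= \sum g_i\, S \bar{x} a_i\, S \bar{a}_i x = \sum g_i (\operatorname{mod} S \bar{a}_i x)^2 \\
&\le \sum g_i (\operatorname{mod} a_i \operatorname{mod} x)^2 = \Bigl( \sum g_i \Bigr) (\operatorname{mod} x)^2 \\
&= \pmb{|} A \pmb{|}^2 (\operatorname{mod} x)^2,
\end{aligned} \]

or

\[ \operatorname{mod} Ax \le \pmb{|} A \pmb{|} \operatorname{mod} x. \tag{55} \]

From (54) we then have

\[ \operatorname{mod} SyAx \le \pmb{|} A \pmb{|} \operatorname{mod} x \operatorname{mod} y. \tag{56} \]

² The $r!$ enters here only because of the numerical factor introduced in defining $Suv$ (cf. §5.16).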

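8.12 Matric functions of a scalar variable. If the coordinates of a matrix $a(t) = \| a_{pq}(t) \|$ are functions of a scalar variable $t$, the matrix itself is called a matric function of $t$. The derivative, when it exists, is defined as

\[ \frac{da}{dt} = \lim_{h \to 0} \frac{a(t + h) - a(t)}{h} = \Bigl\| \frac{d a_{pq}}{dt} \Bigr\|, \tag{57} \]

$h$ being a scalar. The fundamental rules of differentiation are

\[ \frac{d(a + b)}{dt} = \frac{da}{dt} + \frac{db}{dt}, \qquad \frac{d(ab)}{dt} = \frac{da}{dt} b + a \frac{db}{dt}, \qquad \frac{da'}{dt} = \Bigl( \frac{da}{dt} \Bigr)', \]

to which we may add, when $|a| \neq 0$,

\[ \frac{d a^{-1}}{dt} = - a^{-1} \frac{da}{dt} a^{-1}. \tag{58} \]

Other examples are

\[ \frac{d a^2}{dt} = \frac{da}{dt} a + a \frac{da}{dt}, \qquad \frac{d a^3}{dt} = \frac{da}{dt} a^2 + a \frac{da}{dt} a + a^2 \frac{da}{dt}, \]

and in general, if $m$ is any positive integer,

\[ \frac{d a^m}{dt} = \begin{bmatrix} a & \dfrac{da}{dt} \\ m - 1 & 1 \end{bmatrix} = \sum_{k=0}^{m-1} a^k \frac{da}{dt} a^{m-1-k}. \tag{59} \]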
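Formula (59) differs from the scalar rule $m a^{m-1}\, da/dt$ whenever $a$ and $da/dt$ do not commute. The sketch below compares the sum in (59) with a central difference quotient for an assumed sample function $a(t)$; the function names and the sample are illustrative only, and the difference quotient is an approximation, not the limit (57) itself.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def a(t):
    """A sample matric function of t whose derivative does not commute with it."""
    return [[t, 1.0], [0.0, t * t]]

def da(t):
    """The derivative of the sample function, coordinate by coordinate."""
    return [[1.0, 0.0], [0.0, 2.0 * t]]

def power(x, k):
    """x^k for a non-negative integer k."""
    result = [[float(i == j) for j in range(len(x))] for i in range(len(x))]
    for _ in range(k):
        result = matmul(result, x)
    return result

def derivative_of_power(t, m):
    """Right-hand side of (59): the sum of a^k (da/dt) a^(m-1-k) for k = 0..m-1."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(m):
        term = matmul(matmul(power(a(t), k), da(t)), power(a(t), m - 1 - k))
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def difference_quotient(t, m, h=1e-5):
    """Central difference approximation to d(a^m)/dt."""
    plus, minus = power(a(t + h), m), power(a(t - h), m)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

exact = derivative_of_power(0.7, 3)
approx = difference_quotient(0.7, 3)
naive = [[3 * v for v in row] for row in matmul(power(a(0.7), 2), da(0.7))]
```

For this sample the (1, 2) coordinate of $d a^3/dt$ is $2t + 3t^2 + 4t^3$, whereas the naive expression $3 a^2\, da/dt$ gives $6t^2 + 6t^3$ there.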

Under the usual conditions each of the coordinates of $a(t)$ is expansible as a Taylor series and this is therefore also true of $a(t)$. If $f(t)$ is a scalar function, $f(a)$ may or may not have a meaning. For instance, if $f(t)$ can be expanded in a power series which converges for $\operatorname{mod} t < \alpha$, then the same power series³ in $a$ converges when $\pmb{|} a \pmb{|} < \alpha$; but if $f(t)$ is defined by a Fourier series which is not differentiable, $f(a)$ will not have a meaning when the elementary divisors of $a$ are not simple, as is seen immediately on referring to the form of §8.03 (14). If $f(a)$ and $f'(a)$ have a meaning and if $da/dt$ is commutative with $a$, we have $df(a)/dt = f'(a)\, da/dt$. For instance, if $x$ is a constant matrix and $a = t - x$, then

\[ \frac{d \log (t - x)}{dt} = \frac{1}{t - x} \tag{60} \]

as is also easily proved directly.

³ If $g(t) = \sum u_n t^n$, $u_n$ scalar, the series intended here is $\sum u_n a^n$. Other definitions are possible, e.g., if we set

\[ G(a) = \sum_n u_n \sum_i c_{i0}^{(n)} a\, c_{i1}^{(n)} a\, c_{i2}^{(n)} \cdots a\, c_{in}^{(n)}, \]

where $\sum_i c_{i0}^{(n)} c_{i1}^{(n)} \cdots c_{in}^{(n)} = 1$, we still have $G(t) = g(t)$.

The integral of $a(t)$ is defined as follows. If $C$ is a regular contour in the $t$-plane, we shall set

\[ \int_C a(t)\, dt = \Bigl\| \int_C a_{pq}(t)\, dt \Bigr\| \tag{61} \]

or if $t_1, t_2, \cdots$ is a series of points on $C$ and $t_i'$ a point on the arc $(t_i, t_{i+1})$, and if the number of points is increased indefinitely in such a way that $\operatorname{mod} (t_{i+1} - t_i)$ approaches 0 for every interval, then

\[ \int_C a(t)\, dt = \lim \sum a(t_i')(t_{i+1} - t_i). \tag{62} \]

The conditions for the existence of this limit are exactly the same as in the scalar theory.

If $M$ is the least upper bound of $\pmb{|} a \pmb{|}$ on $C$ and $L$ is the length of $C$, it follows in the usual manner that

\[ \pmb{\Bigl|} \int_C a(t)\, dt \pmb{\Bigr|} \le \int_C \pmb{|} a(t) \pmb{|} \operatorname{mod} dt \le ML. \tag{63} \]

As an illustration of these definitions we shall now employ contour integration to prove some of our earlier results. If $x$ is an arbitrary constant matrix and $C$ a circle with center $t = 0$ and radius greater than $\pmb{|} x \pmb{|}$, then all the roots of $x$ lie inside $C$ and on $C$ the series

\[ \frac{1}{t - x} = \frac{1}{t} + \frac{x}{t^2} + \frac{x^2}{t^3} + \cdots \]

is uniformly convergent. Hence

\[ \frac{1}{2 \pi i} \int_C \frac{dt}{t - x} = 1, \tag{64} \]

a result which may also be derived from the definition of $\log (t - x)$ in §8.07 and

\[ \int_C \frac{dt}{t - x} = [\log (t - x)]_C. \]

We then have

\[ \frac{1}{2 \pi i} \int_C \frac{t\, dt}{t - x} = x \]

and in general, if $g(t)$ is a scalar function of $t$ which is analytic in a region enclosing $C$,

\[ g(x) = \frac{1}{2 \pi i} \int_C \frac{g(t)}{t - x}\, dt. \tag{65} \]
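Formula (65) lends itself to a numerical check. The sketch below approximates the contour integral on a circle by the trapezoidal rule, writing $t = R e^{i\theta}$ so that $dt/2\pi i = t\, d\theta / 2\pi$; the $2 \times 2$ sample matrix, the radius 5, the number of sample points and the function names are all assumptions made for the illustration.

```python
import cmath
import math

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix with non-vanishing determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def cauchy_integral(g, x, radius, points=64):
    """Trapezoidal approximation to (1 / 2 pi i) * integral of g(t) (t - x)^(-1) dt."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(points):
        t = radius * cmath.exp(2j * math.pi * k / points)
        resolvent = inverse_2x2([[t - x[0][0], -x[0][1]], [-x[1][0], t - x[1][1]]])
        weight = g(t) * t / points          # dt / (2 pi i) = t d(theta) / (2 pi)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * resolvent[i][j]
    return total

x = [[1.0, 2.0], [0.0, 3.0]]                # absolute value sqrt(14), less than 5
unit = cauchy_integral(lambda t: 1.0, x, radius=5.0)      # illustrates (64)
square = cauchy_integral(lambda t: t * t, x, radius=5.0)  # illustrates (65) with g(t) = t^2
```

The second result agrees with $x^2$, whose rows are (1, 8) and (0, 9), to within rounding error.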

Suppose now that $|t - x| = \theta(t) \varphi(t)$, $\operatorname{adj} (t - x) = \theta(t) a(t)$ where $\theta(t)$ is the highest common factor of $|t - x|$ and the coordinates of $\operatorname{adj} (t - x)$. We then have

\[ g(x) = \frac{1}{2 \pi i} \int_C \frac{g(t) a(t)}{\varphi(t)}\, dt \tag{66} \]

and under the given conditions this vanishes if, and only if, $g(t)/\varphi(t)$ has no singularities inside $C$, that is, if $\varphi$ is a factor of $g$. We have therefore the theorem of §2.05 that $\varphi(t)$ is the reduced characteristic function of $x$ and that $g(x) = 0$ only when $\varphi(t)$ is a factor of $g(t)$.

Since $a(t) = \varphi(t)/(t - x)$ is a polynomial in $x$ with scalar coefficients and with degree 1 less than the degree of $\varphi(t)$, say

\[ a(t) = \alpha_1 x^{m-1} + \alpha_2 x^{m-2} + \cdots + \alpha_m, \]

equation (66) shows that $g(x)$ can be expressed as a polynomial in $x$, namely,

\[ g(x) = \sum_i x^{m-i}\, \frac{1}{2 \pi i} \int_C \frac{g(t) \alpha_i}{\varphi(t)}\, dt. \tag{67} \]

We may also note that (66) leads to the interpolation formula, §8.01 (4) if the integral is expanded in terms of the residues at the zeros of $\varphi(t)$.

All of these results can be extended to unilateral series in $x$ with matric coefficients if care is taken to use $g(t)(t - x)^{-1}$ or $(t - x)^{-1} g(t)$ according as dextro- or laevo-lateral series are desired.


8.13 Functions of a variable vector. Before considering functions of a variable matrix, we shall consider briefly those of a variable vector; for more extended and systematic treatments the reader is referred to treatises on vector and tensor analysis.

The differential of a function of a variable in any non-commutative algebra was defined by Hamilton as follows. Let $f(x)$ be a function of a variable $x$, $dx$ a variable independent of $x$ and $t$ a scalar variable; then

\[ df(x) = \lim_{t \to 0} \frac{f(x + t\, dx) - f(x)}{t}. \tag{68} \]

We shall assume tacitly hereafter that this limit exists for all the functions we shall consider.

An immediate consequence of (68) is that $df(x)$ is linear and homogeneous in $dx$. Hence, if $x = \sum \xi_i e_i$, $dx = \sum d\xi_i\, e_i$, then

\[ df(x) = \lim_{t \to 0} \frac{f(x + t \sum d\xi_i\, e_i) - f(x)}{t} = \sum \frac{\partial f}{\partial \xi_i}\, d\xi_i. \]

This leads to Hamilton's differential operator

\[ \nabla = \sum e_i \frac{\partial}{\partial \xi_i} \tag{69} \]

in terms of which we may write (68) in the form

\[ df(x) = (S\, dx\, \nabla) f(x). \tag{70} \]
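Hamilton's definition (68) applies to any non-commutative variable, and its effect is easiest to see when $x$ is a matrix. The sketch below takes, as an assumed example, $f(x) = x^2$, for which the differential is $x\, dx + dx\, x$, and compares this with the difference quotient of (68) at a small finite $t$; the names used are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Sum of two matrices of the same shape."""
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def square(x):
    """The sample function f(x) = x^2."""
    return matmul(x, x)

def hamilton_differential(f, x, dx, t=1e-6):
    """Difference quotient of (68) at a small finite t (the limit is not taken)."""
    n = len(x)
    shifted = [[x[i][j] + t * dx[i][j] for j in range(n)] for i in range(n)]
    fx, fs = f(x), f(shifted)
    return [[(fs[i][j] - fx[i][j]) / t for j in range(n)] for i in range(n)]

x = [[1.0, 2.0], [3.0, 4.0]]
dx = [[0.0, 1.0], [1.0, 0.0]]
numeric = hamilton_differential(square, x, dx)
exact = add(matmul(x, dx), matmul(dx, x))                  # d(x^2) = x dx + dx x
naive = [[2 * v for v in row] for row in matmul(x, dx)]    # 2 x dx, valid only if x, dx commute
```

The quotient agrees with `exact` up to a term of order `t`, and differs from `naive` because `x` and `dx` do not commute.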

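In using this operator it is frequently convenient to place it after its operand and, when this is done, some artifice is necessary to indicate the connection between them. This is done by attaching the same subscript to both; the method of doing this will be clear from the following examples in which $a = \sum a_i e_i$, $b = \sum b_i e_i$ are vectors and $A = \| a_{ij} \|$ is a matrix.

\[ \nabla a = \sum \frac{\partial a_i}{\partial \xi_j} e_j e_i, \qquad a_\alpha \nabla_\alpha = \sum \frac{\partial a_i}{\partial \xi_j} e_i e_j, \]

\[ \nabla Sab = \nabla_\alpha S a_\alpha b + \nabla_\alpha S a b_\alpha, \]

\[ S \nabla_\alpha \nabla_\beta\, S a_\alpha b_\beta = \sum \frac{\partial a_i}{\partial \xi_j} \frac{\partial b_i}{\partial \xi_j}, \]

\[ A_\alpha \nabla_\alpha = \sum \frac{\partial a_{jk}}{\partial \xi_i}\, e_j S e_k e_i, \]

\[ S a_\alpha A \nabla_\alpha = \sum a_{ij} \frac{\partial a_i}{\partial \xi_j}, \qquad S a A_\alpha \nabla_\alpha = \sum a_i \frac{\partial a_{ij}}{\partial \xi_j}. \]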


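We can now consider the effect of a change of variable from $x$ to $\bar{x}$.⁴ Let $\bar{x} = \sum \bar{\xi}_i e_i$, $\bar{\nabla} = \sum e_i\, \partial / \partial \bar{\xi}_i$; then

\[ d\bar{x} = \bar{x}_\alpha S \nabla_\alpha\, dx = J\, dx \tag{71} \]

where

\[ J = \bar{x}_\alpha S \nabla_\alpha = \| \partial \bar{\xi}_i / \partial \xi_j \| \tag{72} \]

is the Jacobian matrix of the transformation. Similarly

\[ dx = x_\alpha S \bar{\nabla}_\alpha\, d\bar{x} = \bar{J}\, d\bar{x}. \]

Hence

\[ J \bar{J} = 1. \tag{73} \]

Again, since $S\, dx\, \nabla = S\, d\bar{x}\, \bar{\nabla} = S\, dx\, J' \bar{\nabla}$, hence

\[ \nabla = J' \bar{\nabla}, \qquad \bar{\nabla} = (J')^{-1} \nabla = \bar{J}' \nabla. \tag{74} \]

⁴ Here $\bar{x}$ denotes a new variable and not the conjugate imaginary. Instead of considering a change of variable, we may regard $\bar{x}$ as a vector function of $x$.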
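Relations (73) and (74) can be tried on a simple change of variable. The sketch below assumes the hypothetical transformation $\bar{\xi}_1 = \xi_1 + \xi_2^2$, $\bar{\xi}_2 = \xi_2$ and the scalar function $\bar{f} = \bar{\xi}_1 \bar{\xi}_2$, with the Jacobians and gradients written out by hand; none of these choices comes from the text.

```python
def to_bar(xi):
    """A hypothetical change of variable x -> x-bar."""
    return (xi[0] + xi[1] ** 2, xi[1])

def jacobian(xi):
    """J = || d(xi-bar_i) / d(xi_j) || for the map above."""
    return [[1.0, 2.0 * xi[1]], [0.0, 1.0]]

def jacobian_bar(xb):
    """J-bar for the inverse map xi = (xb_1 - xb_2^2, xb_2)."""
    return [[1.0, -2.0 * xb[1]], [0.0, 1.0]]

def grad_bar(xb):
    """nabla-bar applied to f-bar(x-bar) = xb_1 * xb_2."""
    return [xb[1], xb[0]]

def grad(xi):
    """nabla applied to f(x) = f-bar(x-bar(x)) = (xi_1 + xi_2^2) * xi_2."""
    return [xi[1], xi[0] + 3.0 * xi[1] ** 2]

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_transverse(m, v):
    """The vector m' v, where m' is the transverse (transpose) of m."""
    return [sum(m[k][i] * v[k] for k in range(2)) for i in range(2)]

xi = (1.5, -2.0)
xb = to_bar(xi)
product = matmul(jacobian(xi), jacobian_bar(xb))          # should be 1 by (73)
pulled_back = apply_transverse(jacobian(xi), grad_bar(xb))  # J' nabla-bar f-bar
```

At the sample point both sides of $\nabla = J' \bar{\nabla}$ give the vector $(-2, 13.5)$.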

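From (70) and (72) we see that the differentials of $J$ and $J'$ are given by

\[ \begin{aligned}
dJ &= S\, dx\, \nabla_\alpha \cdot J_\alpha, & \text{and} \quad dJ &= d\bar{x}_\alpha S \nabla_\alpha = J_\alpha\, dx\, S \nabla_\alpha, \\
dJ' &= S\, dx\, \nabla_\alpha \cdot J'_\alpha, & dJ' &= \nabla_\alpha S J_\alpha\, dx = \nabla_\alpha S\, dx\, J'_\alpha.
\end{aligned} \tag{75} \]

This leads to the notion of contravariant and covariant vectors. If $u$ is a vector function of $x$ and $\bar{u}$ the corresponding⁵ function after the change of variable, $u$ is called contravariant if

\[ \bar{u} = J u, \qquad u = \bar{J} \bar{u}, \tag{76} \]

and covariant when

\[ \bar{u} = \bar{J}' u, \qquad u = J' \bar{u}. \tag{77} \]

If $d_1$, $d_2$ denote two independent variations so that $d_1(d_2 x) = d_2(d_1 x)$, then

\[ \begin{aligned}
d_2 d_1 \bar{x} &= d_2 (J\, d_1 x) = d_2 J\, d_1 x + J\, d_1 d_2 x \\
&= J_\alpha\, d_2 x\, S \nabla_\alpha\, d_1 x + J\, d_1 d_2 x.
\end{aligned} \tag{78} \]

Hence second differentials are neither contra- nor co-variant.

If $A$ is a matrix whose coordinates are functions of $x$, the bilinear differential form $S\, d_1 x\, A\, d_2 x$ when transformed becomes

\[ S\, d_1 x\, A\, d_2 x = S\, d_1 \bar{x}\, \bar{J}' A \bar{J}\, d_2 \bar{x} \]

so that, if this form is invariant, that is, $S\, d_1 x\, A\, d_2 x = S\, d_1 \bar{x}\, \bar{A}\, d_2 \bar{x}$, we must have

\[ A = J' \bar{A} J, \qquad A' = J' \bar{A}' J, \qquad A\, dx = J' \bar{A}\, d\bar{x}. \tag{79} \]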

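⁵ As will be seen below, this does not necessarily mean merely the result of substituting $\bar{x}$ for $x$ in the coordinates of $u$.

Hence when $A$ is defined in this manner, $A\, dx$ is a covariant vector. If by analogy with (78) we form a second differential of this vector and of $A'\, dx$, we get, using $d\bar{x} = J\, dx$,

\[ \begin{aligned}
d_2 (A\, d_1 x) &= d_2 J'\, \bar{A}\, d_1 \bar{x} + J' d_2 (\bar{A}\, d_1 \bar{x}), \\
d_1 (A'\, d_2 x) &= d_1 J'\, \bar{A}'\, d_2 \bar{x} + J' d_1 (\bar{A}'\, d_2 \bar{x}).
\end{aligned} \]

From (75) $d_2 J' = \nabla_\alpha S J_\alpha\, d_2 x$, $d_1 J' = \nabla_\alpha S\, d_1 x\, J'_\alpha$; hence after a simple reduction

\[ \begin{aligned}
d_2 (A\, d_1 x) + d_1 (A'\, d_2 x) &= \nabla_\alpha S\, d_1 x\, (J' \bar{A}' J_\alpha + J'_\alpha \bar{A}' J)\, d_2 x + J' \bigl( d_2 (\bar{A}\, d_1 \bar{x}) + d_1 (\bar{A}'\, d_2 \bar{x}) \bigr) \\
&= \nabla_\alpha S\, d_1 x\, (A'_\alpha - J' \bar{A}'_\alpha J)\, d_2 x + J' \bigl( d_2 (\bar{A}\, d_1 \bar{x}) + d_1 (\bar{A}'\, d_2 \bar{x}) \bigr)
\end{aligned} \]

which may be written

\[ \begin{aligned}
a &= d_2 (A\, d_1 x) + d_1 (A'\, d_2 x) - \nabla_\alpha S\, d_1 x\, A'_\alpha\, d_2 x \\
&= J' \bigl( d_2 (\bar{A}\, d_1 \bar{x}) + d_1 (\bar{A}'\, d_2 \bar{x}) - \bar{\nabla}_\alpha S\, d_1 \bar{x}\, \bar{A}'_\alpha\, d_2 \bar{x} \bigr) \\
&= J' \bar{a}
\end{aligned} \tag{80} \]

so that $a$ is a covariant vector. This vector may also be written

\[ a = d_2 A\, d_1 x + d_1 A'\, d_2 x - \nabla_\alpha S\, d_1 x\, A'_\alpha\, d_2 x + (A + A')\, d_1 d_2 x. \]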

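Using a notation suggested by the Christoffel symbols we now write

\[ \begin{aligned}
[A;\, d_1 x,\, d_2 x] &= \tfrac{1}{2} (d_2 A\, d_1 x + d_1 A'\, d_2 x - \nabla_\alpha S\, d_1 x\, A'_\alpha\, d_2 x) \\
&= \tfrac{1}{2} \sum \Bigl( \frac{\partial a_{ki}}{\partial \xi_j} + \frac{\partial a_{jk}}{\partial \xi_i} - \frac{\partial a_{ji}}{\partial \xi_k} \Bigr)\, d_1 \xi_i\, d_2 \xi_j\, e_k,
\end{aligned} \tag{81} \]

\[ \begin{aligned}
\begin{Bmatrix} A \\ d_1 x,\ d_2 x \end{Bmatrix} &= (A + A')^{-1} (d_2 A\, d_1 x + d_1 A'\, d_2 x - \nabla_\alpha S\, d_1 x\, A'_\alpha\, d_2 x) \\
&= 2 (A + A')^{-1} [A;\, d_1 x,\, d_2 x]
\end{aligned} \tag{82} \]

provided that $|A + A'| \neq 0$. If we now set

\[ b = \begin{Bmatrix} A \\ d_1 x,\ d_2 x \end{Bmatrix} + d_1 d_2 x \]

and use the relation $(A + A')^{-1} J' = J^{-1} (\bar{A} + \bar{A}')^{-1}$, we have from (80)

\[ \bar{b} = J b \tag{83} \]

so that $b$ is contravariant.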

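If we set $A = \tfrac{1}{2}(A + A') + \tfrac{1}{2}(A - A') = B + C$, we get from (81) and (80)

\[ \begin{aligned}
[A;\, d_1 x,\, d_2 x] &= [B;\, d_1 x,\, d_2 x] + [C;\, d_1 x,\, d_2 x], \\
[B;\, d_1 x,\, d_2 x] + B\, d_1 d_2 x &= J' \bigl( [\bar{B};\, d_1 \bar{x},\, d_2 \bar{x}] + \bar{B}\, d_1 d_2 \bar{x} \bigr), \\
[C;\, d_1 x,\, d_2 x] &= J' [\bar{C};\, d_1 \bar{x},\, d_2 \bar{x}].
\end{aligned} \tag{84} \]

We shall require two transverses of the Christoffel matrices; these are defined by

\[ S b [A;\, a,\, c] = S c [A;\, a,\, b]', \qquad S b \begin{Bmatrix} A \\ a,\ c \end{Bmatrix} = S c \begin{Bmatrix} A \\ a,\ b \end{Bmatrix}' \]

whence

\[ \begin{aligned}
2 [A;\, a,\, b]' &= \nabla_\alpha S a A'_\alpha b + A_\alpha b\, S a \nabla_\alpha - A_\alpha a\, S b \nabla_\alpha, \\
\begin{Bmatrix} A \\ a,\ b \end{Bmatrix}' &= 2 [A;\, a,\, (A + A')^{-1} b]'.
\end{aligned} \tag{85} \]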

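To illustrate partial differentiation we shall consider functions which depend not only on $x$ but also on a contravariant variable vector $u = \sum \upsilon_i e_i$. Since $\bar{u} = J u$ and $J = \| \partial \bar{\xi}_i / \partial \xi_j \|$ is independent of $u$, we have

\[ \frac{\partial}{\partial \upsilon_i} = \sum_j \frac{\partial \bar{\xi}_j}{\partial \xi_i} \frac{\partial}{\partial \bar{\upsilon}_j}, \qquad \frac{\partial}{\partial \xi_i} = \sum_j \frac{\partial \bar{\xi}_j}{\partial \xi_i} \frac{\partial}{\partial \bar{\xi}_j} + \sum_j \frac{\partial \bar{\upsilon}_j}{\partial \xi_i} \frac{\partial}{\partial \bar{\upsilon}_j}. \]

Hence, if $\nabla' = \sum e_i\, \partial / \partial \upsilon_i$, $\bar{\nabla}' = \sum e_i\, \partial / \partial \bar{\upsilon}_i$, then

\[ \nabla' = J' \bar{\nabla}', \qquad \nabla = J' \bar{\nabla} + d_u J' \cdot \bar{\nabla}' \tag{86} \]

where $d_u J' = S u \nabla_\alpha \cdot J'_\alpha$. Here $\nabla'$ is covariant but $\nabla$ is neither covariant nor contravariant, which means that formulae dependent on it will not usually be invariant in form under a change of variable. This difficulty is avoided as follows. If we combine (83) with (78) and replace $d_1 x$, $d_2 x$ by contravariant vectors $a$, $b$, then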

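\[ \begin{Bmatrix} \bar{A} \\ \bar{a},\ \bar{b} \end{Bmatrix} = J \begin{Bmatrix} A \\ a,\ b \end{Bmatrix} - d_b J\, a, \qquad d_b J\, a = d_a J\, b = J_\alpha b\, S \nabla_\alpha a. \tag{87} \]

Hence, for a covariant vector $c = J' \bar{c}$,

\[ \begin{Bmatrix} A \\ a,\ c \end{Bmatrix}' = J' \begin{Bmatrix} \bar{A} \\ \bar{a},\ \bar{c} \end{Bmatrix}' + d_a J'\, \bar{c} \tag{87'} \]

and therefore

\[ d_u J' \cdot \bar{\nabla}' = \begin{Bmatrix} A \\ u,\ \nabla' \end{Bmatrix}' - J' \begin{Bmatrix} \bar{A} \\ \bar{u},\ \bar{\nabla}' \end{Bmatrix}' \]

whence

\[ \nabla = J' \bar{\nabla} + \begin{Bmatrix} A \\ u,\ \nabla' \end{Bmatrix}' - J' \begin{Bmatrix} \bar{A} \\ \bar{u},\ \bar{\nabla}' \end{Bmatrix}', \]

\[ D = \nabla - \begin{Bmatrix} A \\ u,\ \nabla' \end{Bmatrix}' = J' \bar{D}; \tag{88} \]

$D$ is therefore a covariant differential operator.

Let $v$ be any contravariant vector and set $d_u v = S u \nabla_\alpha \cdot v_\alpha$; then, if $f$ is any function of $x$ and $u$,

\[ d_v f \equiv (S v \nabla_\alpha + S\, d_u v\, \nabla'_\alpha) f_\alpha = (S v D_\alpha + S\, \delta_u v\, \nabla'_\alpha) f_\alpha \tag{89} \]

where $\delta_u v$ is the contravariant vector defined by

\[ \delta_u v = d_u v + \begin{Bmatrix} A \\ u,\ v \end{Bmatrix} = V u. \tag{90} \]

The tensor corresponding to the matrix $V$ is known as the covariant derivative of $v$.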

8.14 Functions of a variable matrix. The general theory of analytic functions of a variable matrix $x = \| \xi_{ij} \|$ is co-extensive with that of $n^2$ scalar variables and hence is so general as to be void of properties peculiar to matrices. This follows immediately from the obvious relation

\[ \xi_{ij} = \sum_p e_{pi}\, x\, e_{jp} \]

which expresses the $(i, j)$ coordinate as a linear function of $x$.
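The relation above is easily confirmed on a sample matrix. In the following sketch (indices run from 0, and the function names are hypothetical) the sum $\sum_p e_{pi}\, x\, e_{jp}$ is formed explicitly; it comes out as the scalar matrix $\xi_{ij} \cdot 1$.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unit(i, j, n):
    """Matric unit e_ij of order n (indices from 0)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def coordinate_as_linear_function(x, i, j):
    """The sum over p of e_pi x e_jp."""
    n = len(x)
    total = [[0] * n for _ in range(n)]
    for p in range(n):
        term = matmul(matmul(unit(p, i, n), x), unit(j, p, n))
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
picked = coordinate_as_linear_function(x, 0, 2)    # selects the coordinate 3
```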
The differential operator⁶ corresponding to $x$ is

\[ \Delta = \Bigl\| \frac{\partial}{\partial \xi_{ij}} \Bigr\|. \tag{91} \]

It is often convenient to have a special notation for the transverse $\Delta'$ and when this is so we shall set

\[ \partial = \Bigl\| \frac{\partial}{\partial \xi_{ji}} \Bigr\| = \Delta'. \tag{92} \]

These operators may stand after their operands and the same convention as was used for subscripts attached to $\nabla$ will also be used for $\Delta$ and $\partial$ when necessary.

The fundamental property of $\Delta$ is

\[ df = \operatorname{tr} (dx\, \Delta') f = \operatorname{tr} (dx\, \partial) f \tag{93} \]

where $f$ is any function of $x$ and $\operatorname{tr} (A)$ stands for the trace of the matrix $A$. This result follows immediately from

\[ \operatorname{tr} (dx\, \Delta') = \sum_i \sum_k d\xi_{ik} \frac{\partial}{\partial \xi_{ik}}. \]

⁶ This operator first occurs in a paper by Taber (1890, (84)) who however did not make any systematic use of it. Macaulay in a tract published in 1893 (110) but written in 1887 used $\Delta$ consistently in applying quaternions to physical problems; he used the notation G for $\Delta$. Later Born (385) used the same operator to great effect in his theory of quantum matrices. Turnbull (436) uses it for $\Delta'$.

8.15 Differentiation formulae. We collect here the principal formulae of differentiation; in each case the operand is $x$ or $x'$ and the dummy subscript is omitted except when the meaning is ambiguous without it. To simplify the expressions we set $\alpha$ and $\beta$ for the traces of the matrices $a$ and $b$, and $\xi_r$ for the trace of $x^r$.

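\[ \begin{aligned}
\Delta\, axb &= a'b = a'xb'\Delta, & \Delta\, ax'b &= \alpha b = bx'a\Delta, \\
\Delta'\, axb &= \alpha b = bxa\Delta', & \Delta'\, ax'b &= a'b = a'x'b'\Delta',
\end{aligned} \tag{94} \]

\[ \begin{aligned}
\Delta x^r &= x^{r-1} + x'x^{r-2} + (x')^2 x^{r-3} + \cdots + (x')^{r-1} = (x')^r \Delta', \\
x^r \Delta &= x^{r-1} + x^{r-2}x' + x^{r-3}(x')^2 + \cdots + (x')^{r-1} = \Delta' (x')^r, \\
\Delta' x^r &= n x^{r-1} + \xi_1 x^{r-2} + \xi_2 x^{r-3} + \cdots + \xi_{r-1} = x^r \Delta',
\end{aligned} \tag{95} \]

\[ \Delta \operatorname{tr} (axb) = \Delta \operatorname{tr} (bax) = a'b', \qquad \partial \operatorname{tr} (axb) = ba, \tag{96} \]

\[ \Delta \operatorname{tr} (x^r) = \Delta \xi_r = r (x')^{r-1}, \qquad \partial \operatorname{tr} (x^r) = r x^{r-1}, \tag{97} \]

\[ \operatorname{tr} (\Delta)\, axb = ab = \operatorname{tr} (\partial)\, axb, \qquad \operatorname{tr} (\Delta)\, x^r = r x^{r-1} = \operatorname{tr} (\partial)\, x^r, \tag{98} \]

\[ \begin{aligned}
\partial \operatorname{tr} \begin{bmatrix} x & a \\ r & s \end{bmatrix} &= (r + s) \begin{bmatrix} x & a \\ r - 1 & s \end{bmatrix} = \operatorname{tr} (\partial) \begin{bmatrix} x & a \\ r & s \end{bmatrix}, \\
\Delta \operatorname{tr} \begin{bmatrix} x & a \\ r & s \end{bmatrix} &= (r + s) \begin{bmatrix} x' & a' \\ r - 1 & s \end{bmatrix},
\end{aligned} \tag{99} \]

(100)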

xA = tr (x r ) = 


(A a x a — x'A a ) x r a - (x') r . 


The proofs of these formulae are all very similar and we shall consider here
only the most important leaving the remainder to the reader. If $a = \|a_{ij}\|$,
$b = \|b_{ij}\|$, then

$\Delta\,axb = \Big\|\sum_{p,q,r}\frac{\partial}{\partial x_{ip}}\,a_{pq}x_{qr}b_{rj}\Big\| = a'b;$

hence also

$a'x'b'\Delta' = (\Delta\,bxa)' = (b'a)' = a'b.$

The remaining parts of (94) follow in the same way. It follows also from
(94) that

$\Delta_\alpha x_\alpha^r = \Delta_\alpha x_\alpha x^{r-1} + \Delta_\alpha xx_\alpha^{r-1} = x^{r-1} + \Delta_\alpha xx_\alpha x^{r-2} + \Delta_\alpha x^2x_\alpha^{r-2}$
$\qquad = x^{r-1} + x'x^{r-2} + \Delta_\alpha x^2x_\alpha x^{r-3} + \Delta_\alpha x^3x_\alpha^{r-3},$

and so on; the remaining parts of (95) follow from (94) in the same way.
To prove (96) we notice first that $\mathrm{tr}(ab) = \mathrm{tr}(ba)$ and hence in $\mathrm{tr}(a_1a_2\cdots a_r)$
the factors may be permuted cyclically. Then, if $c = \|c_{ij}\|$,

$\partial\,\mathrm{tr}(cx) = \sum_{i,j}e_{ij}\frac{\partial}{\partial x_{ji}}\sum_{p,q}c_{pq}x_{qp} = \sum_{i,j}c_{ij}e_{ij} = c.$

Formula (97) follows by repeated application of (96); thus

$\Delta\,\mathrm{tr}(x^r) = \Delta_\alpha\mathrm{tr}(x_\alpha x^{r-1}) + \Delta_\alpha\mathrm{tr}(x_\alpha^{r-1}x)$
$\qquad = (x')^{r-1} + \Delta_\alpha\mathrm{tr}(x_\alpha x^{r-1}) + \Delta_\alpha\mathrm{tr}(x_\alpha^{r-2}x^2)$
$\qquad = (x')^{r-1} + (x')^{r-1} + \cdots$
$\qquad = r(x')^{r-1}.$

The remaining formulae are proved in the same way. 
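
The first formula of (94) and the two formulae of (96) can be checked on a small example. The following is only a sketch: it assumes that the untransposed operator has $\partial/\partial x_{ij}$ in row $i$, column $j$, and that the transverse operator has $\partial/\partial x_{ji}$ there, which is the reading consistent with the proofs above. The helper names are ours. Since $axb$ and $\mathrm{tr}(axb)$ are linear in $x$, each partial derivative is obtained exactly by substituting a matrix unit for $x$.

```python
from fractions import Fraction

def matmul(p, q):
    """Product of two square matrices stored as lists of rows."""
    n = len(p)
    return [[sum(p[i][t] * q[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def trace(p):
    return sum(p[i][i] for i in range(len(p)))

def unit(n, i, j):
    """Matrix unit e_ij: 1 in row i, column j, zeros elsewhere."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(n)]
            for r in range(n)]

def d_trace_axb(a, b):
    """Entry (i, j) is d tr(a x b) / d x_ij, which equals tr(a e_ij b)
    exactly because tr(a x b) is linear in x."""
    n = len(a)
    return [[trace(matmul(matmul(a, unit(n, i, j)), b)) for j in range(n)]
            for i in range(n)]

def delta_axb(a, b):
    """Entry (i, j) is sum over p of d (a x b)_pj / d x_ip (assumed
    meaning of the untransposed operator applied to the matrix a x b)."""
    n = len(a)
    return [[sum(matmul(matmul(a, unit(n, i, p)), b)[p][j] for p in range(n))
             for j in range(n)] for i in range(n)]

a = [[Fraction(v) for v in row] for row in [[1, 2], [3, 5]]]
b = [[Fraction(v) for v in row] for row in [[0, 1], [4, -1]]]

untransposed = d_trace_axb(a, b)      # operator with d/dx_ij in place (i, j)
transverse = transpose(untransposed)  # operator with d/dx_ji in place (i, j)

print(transverse == matmul(b, a))                            # second part of (96)
print(untransposed == matmul(transpose(a), transpose(b)))    # first part of (96)
print(delta_axb(a, b) == matmul(transpose(a), b))            # first part of (94)
```

Exact rational arithmetic is used so that the comparisons are equalities and not approximations.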

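If $f(\lambda) = \sum a_r\lambda^r$ is a scalar power series and $f(x) = \sum a_rx^r$, then from (97) and (98)

$\partial\,\mathrm{tr}(f(x)) = \sum ra_rx^{r-1} = f'(x) = \mathrm{tr}(\partial)f(x)$

so that the operators $\partial\,\mathrm{tr}(\ )$ and $\mathrm{tr}(\partial)$ have the same effect on such functions.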


Similarly, if F(x) = 2a r 


x a 
r s. 


, it follows from (99) that 


(101 


0 tr (F) = tr (d)F = 2(r + s r ) 


x a 
r — 1 s r 


8.16 As an illustration of the application of the formulae of the preceding
section we shall give some parts of the theory of quantum matrices which are
applicable to matrices of finite order.

Let $q_1, q_2, \cdots, q_f$; $p_1, \cdots, p_f$ be the coordinates of a dynamical system and
$\mathfrak{H}$ the Hamiltonian function; these coordinates satisfy the system of ordinary
differential equations

(102) $\dot q_i = \frac{\partial\mathfrak{H}}{\partial p_i}, \qquad \dot p_i = -\frac{\partial\mathfrak{H}}{\partial q_i} \qquad (i = 1, 2, \cdots, f).$

We may suppose that $f = n^2$, a perfect square; for, if $(n - 1)^2 < f < n^2$, we
can introduce $2(n^2 - f)$ additional coordinates $q_{f+1}, p_{f+1}, \cdots, q_{n^2}, p_{n^2}$ which
do not occur in $\mathfrak{H}$ so that these variables equated to constants are solutions of
the extended system. When this is done, we can order the $q$'s and $p$'s in square
arrays $\|q_{ij}\|$, $\|p_{ij}\|$ in such a way that $p_{ij}$ corresponds to $q_{ji}$ for all $i$ and $j$.
Equation (102) then becomes

$\dot q_{ij} = \frac{\partial\mathfrak{H}}{\partial p_{ji}}, \qquad \dot p_{ij} = -\frac{\partial\mathfrak{H}}{\partial q_{ji}}$

or, if the matrices $\|q_{ij}\|$ and $\|p_{ij}\|$ are denoted by $q$ and $p$ and the
corresponding transverse differential operators by $\partial_q$ and $\partial_p$,

(103) $\dot q = \partial_p\mathfrak{H}, \qquad \dot p = -\partial_q\mathfrak{H}.$
If we transform (103) by the substitution

$q = e^{wt}Qe^{-wt}, \quad Q = \|Q_{ij}\|, \qquad p = e^{wt}Pe^{-wt}, \quad P = \|P_{ij}\|,$

where $w$ is a constant matrix, we get

$\dot q = e^{wt}(\dot Q + wQ - Qw)e^{-wt}, \qquad \dot p = e^{wt}(\dot P + wP - Pw)e^{-wt}.$

Also, if $t$ is kept constant,

$\mathrm{tr}(dq\,\partial_q) = \mathrm{tr}(e^{wt}dQ\,e^{-wt}\partial_q) = \mathrm{tr}(dQ\,e^{-wt}\partial_qe^{wt})$

with a similar relation for $p$. Hence

$\partial_q = e^{wt}\partial_Qe^{-wt}, \qquad \partial_p = e^{wt}\partial_Pe^{-wt}.$

Using these results in (103) we get

(104) $\dot Q + wQ - Qw = \partial_P\mathfrak{H}, \qquad \dot P + wP - Pw = -\partial_Q\mathfrak{H},$

$\mathfrak{H}$ being expressed in terms of $P$ and $Q$ and, if necessary, also $t$. Now from (96)

$Qw - wQ = \partial_P\mathrm{tr}[w(PQ - QP)], \qquad -(Pw - wP) = \partial_Q\mathrm{tr}[w(PQ - QP)]$

and hence, if

(105) $\mathfrak{K} = \mathfrak{H} + \mathrm{tr}[w(PQ - QP)] = \mathfrak{H} + \mathrm{tr}[w(pq - qp)],$

we have in place of (103)

(106) $\dot Q = \partial_P\mathfrak{K}, \qquad \dot P = -\partial_Q\mathfrak{K},$

so that the transformation is canonical.

If $\dot Q = 0 = \dot P$ in (104), then

$wQ - Qw = \partial_P\mathfrak{H}, \qquad wP - Pw = -\partial_Q\mathfrak{H}$

or on restoring the exponential factor

(107) $wq - qw = \partial_p\mathfrak{H}, \qquad wp - pw = -\partial_q\mathfrak{H}.$

When $\mathfrak{H}$ is given, these are algebraic equations which can be solved for $p$ and $q$;
the solution will of course generally contain arbitrary parameters.

Under the same assumptions (106) becomes

(108) $\partial_P\mathfrak{K} = 0 = \partial_Q\mathfrak{K},$

and if $P$, $Q$ are independent variables, the only solution is $\mathfrak{K}$ = constant and
the only solution for $\mathfrak{H}$ in (107) then has the form

$\mathfrak{H}^* = -\mathrm{tr}[w(pq - qp)]$

apart from an additive constant. Equation (108) may then be written

(109) $\partial_p(\mathfrak{H} - \mathfrak{H}^*) = 0 = \partial_q(\mathfrak{H} - \mathfrak{H}^*).$

Now, if $r_1, r_2, \cdots, r_m$ are the parameters in the solution of (107) we have

$\frac{\partial(\mathfrak{H} - \mathfrak{H}^*)}{\partial r_k} = \sum_{i,j}\frac{\partial p_{ij}}{\partial r_k}\frac{\partial(\mathfrak{H} - \mathfrak{H}^*)}{\partial p_{ij}} + \sum_{i,j}\frac{\partial q_{ij}}{\partial r_k}\frac{\partial(\mathfrak{H} - \mathfrak{H}^*)}{\partial q_{ij}}$

which vanishes in virtue of (109). Hence, if $\mathfrak{H}$ is expressed in terms of $r_1$,
$r_2, \cdots, r_m$ by using the solutions of (107), it will differ by an additive constant
at most from $-\mathrm{tr}[w(pq - qp)]$.


CHAPTER IX 


THE AUTOMORPHIC TRANSFORMATION OF A BILINEAR FORM 

9.01 If the variables of a bilinear form whose matrix is $a$ are transformed
cogrediently by a matrix $x$, the matrix of the new bilinear form is $x'ax$; when
this new form is identical with the old, the transformation is said to be
automorphic. The problem of finding all automorphic transformations of $a$ is
therefore equivalent to solving the equation

(1) $x'ax = a.$

We shall assume for the present that $|a| \neq 0$ in which case also $|x| \neq 0$.

It follows from (1) that $x'a = ax^{-1}$. Hence, if $f(\lambda)$ is a scalar polynomial

(1') $[f(x)]'a = f(x')a = af(x^{-1}).$

In particular, if $f(\lambda) = (1 - \lambda)/(1 + \lambda)$ and $y = f(x^{-1})$, then

(2) $y = \frac{1 - x^{-1}}{1 + x^{-1}} = \frac{x - 1}{x + 1} = -f(x)$

provided $|x + 1| \neq 0$. Hence from (1') $y'a = -ay$ so that

(3) $y' = -aya^{-1}.$

Conversely, if $y$ satisfies (3) and $|1 - y| \neq 0$, then $x = (1 + y)/(1 - y)$ is a
solution of (1) such that $|x + 1| \neq 0$. For from (3) $f(y')a = af(-y)$ so that

$x'a = \frac{1 + y'}{1 - y'}\,a = a\,\frac{1 - y}{1 + y} = ax^{-1}.$

Similarly, if $|x - 1| \neq 0$, we may set $x = -(1 - y)/(1 + y)$ and then $y$ is a
solution of (3) such that $|1 + y| \neq 0$, and conversely. The effect of the
transformation (2) is therefore to reduce the solution of (1), which is quadratic in $x$,
to that of (3), which is linear in $y$, except when both 1 and $-1$ are roots of $x$.
It is because (3) is linear that it is more convenient than (1); in particular if we
regard

(3') $y'a + ay = 0, \qquad y = \|\eta_{ij}\|, \quad a = \|a_{ij}\|,$

as a system of $n^2$ linear homogeneous equations in the $\eta$'s, then the rank of the
system gives the number of parameters which enter into the solution when those
values of $y$ (or of $x$) are excluded for which both 1 and $-1$ are roots.

Since the main problem is thus reduced to the solution of linear equations, it
may be regarded as solved; the solution, however, can be given a somewhat more
definite form as we shall now show.
9.02 The equation $y' = \pm aya^{-1}$. We shall consider in place of (3) the more
general equation

(4) $y' = \delta aya^{-1}, \qquad \delta = \pm 1.$

Forming the transverse we get $y = \delta a'^{-1}y'a'$ or $y' = \delta a'ya'^{-1}$, whence

(5) $ya^{-1}a' = a^{-1}a'y, \qquad a'ya'^{-1} = aya^{-1}$

so that $y$ is commutative with $a^{-1}a'$. Now from (4) we have $y = \delta a^{-1}y'a$ and
hence

$2y = y + \delta a^{-1}y'a.$

But if $b$ is any matrix commutative with $a^{-1}a'$, then

(6) $y = b + \delta a^{-1}b'a$

is a solution of (4); for on substituting this value of $y$ we get

$y' - \delta aya^{-1} = b' + \delta a'ba'^{-1} - \delta aba^{-1} - b' = 0$

since, as in (5), $a'ba'^{-1} = aba^{-1}$. It was noted above that $y$ has this form and it
therefore follows that the general solution of (3) is obtained by setting

(7) $y = b - a^{-1}b'a, \qquad ba^{-1}a' = a^{-1}a'b.$

It should be noted, however, that two different values of $b$ may give rise to the
same value of $y$.

9.03 We are now able to give a solution of (1) under the restriction that either
$|x + 1| \neq 0$ or $|x - 1| \neq 0$. Since the first condition is transformed into the
second if $x$ is changed into $-x$, it is sufficient for the present to assume that
$|x + 1| \neq 0$, and in this case the value of $y$ given by (2) is finite. In terms of $y$
we have

$x = \frac{1 + y}{1 - y} = \frac{1 + b - a^{-1}b'a}{1 - b + a^{-1}b'a} = (a - ab + b'a)^{-1}(a + ab - b'a)$

or, if

(8) $c = ab - b'a,$

then

(9) $x = (a - c)^{-1}(a + c), \quad |x + 1| \neq 0; \qquad x = (c - a)^{-1}(a + c), \quad |x - 1| \neq 0.$

It follows as in §9.01 that, if $x$ has this form, it is a solution of (1).
In place of (8) we may define $c$ by

(10) $c' = -a'a^{-1}c = -ca^{-1}a'.$

For from (7) and (8)

$c' = b'a' - a'b = -a'a^{-1}c = -ca^{-1}a'$
and, if $c$ is given by (10) and $x$ by (9), then

$x' = (a' + c')(a' - c')^{-1} = (a' - ca^{-1}a')(a' + ca^{-1}a')^{-1}$
$\quad = (1 - ca^{-1})(1 + ca^{-1})^{-1} = (1 + ca^{-1})^{-1}(1 - ca^{-1}) = a(a + c)^{-1}(a - c)a^{-1}$
$\quad = ax^{-1}a^{-1}.$

If $a$ is symmetric, (8) or (10) gives $c' = -c$, and $c$ is otherwise arbitrary except
that $|a - c| \neq 0$; in particular if $a = 1$, (9) reduces to the form of an orthogonal
matrix already given in §6.03. Similarly if $a$ is skew, (10) shows that $c$ is an
arbitrary symmetric matrix subject to the condition that $|a - c| \neq 0$.
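
The two special cases just described can be tried on matrices of order 2. The sketch below is illustrative only: the matrices are chosen arbitrarily, the function names are ours, and exact rational arithmetic is assumed so that the test $x'ax = a$ is an equality. It forms $x = (a - c)^{-1}(a + c)$ as in (9), first with $a$ symmetric and $c$ skew, then with $a$ skew and $c$ symmetric.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]

def matmul(p, q):
    return [[sum(p[i][t] * q[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def sub(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def inv2(p):
    """Inverse of a 2 x 2 matrix; fails if the determinant vanishes."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    if det == 0:
        raise ZeroDivisionError("singular matrix")
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def cayley(a, c):
    """x = (a - c)^{-1} (a + c), the first form in (9)."""
    return matmul(inv2(sub(a, c)), add(a, c))

def is_automorphic(x, a):
    """True when x' a x = a, that is, when x satisfies (1)."""
    return matmul(matmul(transpose(x), a), x) == a

# a symmetric, c skew
a_sym, c_skew = mat([[2, 1], [1, 3]]), mat([[0, 1], [-1, 0]])
x_sym = cayley(a_sym, c_skew)

# a skew, c symmetric
a_skew, c_sym = mat([[0, 1], [-1, 0]]), mat([[1, 2], [2, 5]])
x_skew = cayley(a_skew, c_sym)

print(is_automorphic(x_sym, a_sym), is_automorphic(x_skew, a_skew))
```

In both cases $|a - c| \neq 0$ for the chosen matrices, which is the only restriction the text places on $c$.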

The case in which $a$ is symmetric can also be handled as follows. We can set
$a = b^2$ where $b$ is symmetric and, if

$y = bxb^{-1},$

equation (1) gives $y'y = 1$ so that $y$ is orthogonal. Conversely, if $y$ is any
orthogonal matrix $x = b^{-1}yb$ is a solution of (1).


9.04 Principal idempotent and nilpotent elements. Since $x$ is similar to
$(x')^{-1}$, the elementary divisors which correspond to roots other than $\pm 1$ occur in
pairs with reciprocal roots. If we arrange these roots in pairs $g_r$, $g_r^{-1}$ and denote
the corresponding principal idempotent elements by $e_r$ and $e_{-r}$, respectively, we
may set

(11) $x = \sum[g_r(e_r + \zeta_r) + g_r^{-1}e_{-r}(1 + \zeta_{-r})^{-1}] + \theta_1(e_1 + \zeta_1) - \theta_2(e_{-1} + \zeta_{-1})$

where the $\zeta$'s are nilpotent, $e_1$, $e_{-1}$ are the principal idempotent elements
corresponding to 1 and $-1$, if present as roots, and $\theta_1$, $\theta_2$ are either 0 or 1. The
form of $x^{-1}$ is then

(12) $x^{-1} = \sum[g_r(e_{-r} + \zeta_{-r}) + g_r^{-1}e_r(1 + \zeta_r)^{-1}] + \theta_1e_1(1 + \zeta_1)^{-1} - \theta_2e_{-1}(1 + \zeta_{-1})^{-1}$

and (11) gives

(13) $e'_r = ae_{-r}a^{-1}, \quad e'_{-r} = ae_ra^{-1}, \quad \zeta'_r = a\zeta_{-r}a^{-1}, \quad \zeta'_{-r} = a\zeta_ra^{-1},$

(13') $e'_1 = ae_1a^{-1}, \quad e'_{-1} = ae_{-1}a^{-1}, \quad \zeta'_1 = -a\frac{\zeta_1}{1 + \zeta_1}a^{-1}, \quad \zeta'_{-1} = -a\frac{\zeta_{-1}}{1 + \zeta_{-1}}a^{-1}.$
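We require also the form of $x + x^{-1}$ and $x - x^{-1}$; if

(14) $\gamma_r = g_r\zeta_r - g_r^{-1}\frac{\zeta_r}{1 + \zeta_r}, \quad \gamma_{-r} = g_r\zeta_{-r} - g_r^{-1}\frac{\zeta_{-r}}{1 + \zeta_{-r}}, \quad \gamma_1 = \frac{\zeta_1^2}{1 + \zeta_1}, \quad \gamma_{-1} = \frac{\zeta_{-1}^2}{1 + \zeta_{-1}},$

(15) $\delta_r = g_r\zeta_r + g_r^{-1}\frac{\zeta_r}{1 + \zeta_r}, \quad \delta_{-r} = g_r\zeta_{-r} + g_r^{-1}\frac{\zeta_{-r}}{1 + \zeta_{-r}}, \quad \delta_1 = 2\zeta_1 - \gamma_1, \quad \delta_{-1} = 2\zeta_{-1} - \gamma_{-1},$

we have

(16) $x + x^{-1} = \sum[(g_r + g_r^{-1})(e_r + e_{-r}) + \gamma_r + \gamma_{-r}] + \theta_1(2e_1 + \gamma_1) - \theta_2(2e_{-1} + \gamma_{-1}),$

(17) $x - x^{-1} = \sum[\{(g_r - g_r^{-1})e_r + \delta_r\} - \{(g_r - g_r^{-1})e_{-r} + \delta_{-r}\}] + \theta_1\delta_1 - \theta_2\delta_{-1},$

where the elements grouped together are principal elements.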

The principal idempotent elements of $x - x^{-1}$ are also principal idempotent
elements of $x$ except that roots 1 and $-1$ of $x$ both give the same root 0 of
$x - x^{-1}$; no root of $x$ other than $\pm 1$ leads to the coalescing of roots in $x - x^{-1}$.
If we put

(18) $2u = x + x^{-1}, \qquad 2v = x - x^{-1},$

then $u$ is a solution of (4) with $\delta = 1$ and $v$ is a solution with $\delta = -1$; also

(19) $x^2 - 2vx - 1 = 0$

which has the formal solution

(20) $x = v + (v^2 + 1)^{1/2}.$

Here $v^2 + 1 = u^2$ so that $(v^2 + 1)^{1/2}$ exists whenever $x$ is a solution of (1).
Conversely, if $v_1$ is any solution of (4) with $\delta = -1$ and if $u_1$ is a determination of
$(v_1^2 + 1)^{1/2}$ such that

(21) $u_1'a = au_1,$

then $x = u_1 + v_1$ is a solution of (1); for

$x'a = v_1'a + u_1'a = -av_1 + au_1 = ax^{-1}$

since

$(u_1 + v_1)(u_1 - v_1) = u_1^2 - v_1^2 = 1.$

If $v^2 + 1$ has no zero root, determinations of $(v^2 + 1)^{1/2}$ always exist which are
polynomials in $v^2$ and therefore satisfy (21); but even in this case this does not
give all solutions. The situation is as follows. The general form of $v$ is given
by (17) if we replace $g_r - g_r^{-1}$ by, say, $2k_r$ and $\theta_1\delta_1 - \theta_2\delta_{-1}$ by $\delta_0$. When $k_r$ is
given, then $g_r$ is determined; from (13) and (14) we have $(\delta'_r + \delta'_{-r})a = a(\delta_r + \delta_{-r})$
and therefore, if $k_r^2 + 1 \neq 0$, the part of $(v^2 + 1)^{1/2}$ corresponding to $e_r + e_{-r}$
exists and satisfies (21); we therefore get all valid expressions for this part of
$(v^2 + 1)^{1/2}$ by using the form of the square root given in §8.05 with the restriction
that the only sets of partial units that may be used are those that satisfy (21).
However, since $(v^2 + 1)^{1/2} = u$, (16) and (17) show that we need only use the
idempotent elements $e_r + e_{-r}$, which are determined by $v$, in those parts of the
square root which do not depend on the zero root of $v$; $e_1$ and $e_{-1}$, however, are
not defined by $v$ so that it is necessary in any particular case to consider how $e_0$
and $\delta_0$ can be broken up into parts which have the required property.

If $v$ has a zero root with the principal idempotent and nilpotent parts $e_0$, $\delta_0$,
then $e_0^2 = e_0$ shows that, although $\delta_0'a = -a\delta_0$, we have

(21') $e_0'a = ae_0.$

We therefore seek to divide $e_0$ into two idempotent parts, $e_1$ and $e_{-1}$, which are
commutative with $v$ and therefore with $\delta_0$. In forming the square root we then
attach the value $+1$ to $e_1$ and $-1$ to $e_{-1}$.

If $k_r^2 + 1 = 0$, then $g_r = i$ and the corresponding part of $v^2 + 1$ is
$2i(\delta_r + \delta_{-r}) + \delta_r^2 + \delta_{-r}^2$, and it is readily shown from (15) that this has a square root.
The details are left to the reader.

If $b$ is a solution of $b'a = -ab$, then so are also $t = \tan b$ and $v = \tan 2b$. A
short calculation then gives $x = (1 + t)/(1 - t)$ subject to the restrictions
already given; this shows the relations between the rational and irrational
solutions.

9.05 The exponential solution. Some of the difficulties of the solution in
§9.04 can be avoided by setting

(22) $x = \exp(z) \equiv e^z, \qquad z = \mathrm{Log}\,x$

where a principal determination of log $x$ is to be used. Since this determination
of log $x$ is a polynomial in $x$ and $x'ax = a$, we have

(23) $z' = \mathrm{Log}\,x' = \mathrm{Log}\,ax^{-1}a^{-1} = a(\mathrm{Log}\,x^{-1})a^{-1}$

and therefore

(24) $1 = x'axa^{-1} = e^{z'}e^{aza^{-1}} = e^{z' + aza^{-1}}.$

From (11)

$z = \sum[(\log g_r)e_r + \eta_r - (\log g_r)e_{-r} + \eta_{-r}] + \theta_1\eta_1 + \theta_2(\pi ie_{-1} + \eta_{-1}),$

$\eta_s = \zeta_s - \tfrac12\zeta_s^2 + \cdots \quad (s = r, 1, -1), \qquad \eta_{-r} = -\zeta_{-r} + \tfrac12\zeta_{-r}^2 - \cdots,$

and from (13) and (13')

(26) $e'_s = ae_{-s}a^{-1}, \quad \eta'_s = -a\eta_{-s}a^{-1} \quad (s = r, -r),$
     $e'_s = ae_sa^{-1}, \quad \eta'_s = -a\eta_sa^{-1} \quad (s = 1, -1),$

so that

$z' + aza^{-1} = \sum[(\log g_r)e'_r + \eta'_r - (\log g_r)e'_{-r} + \eta'_{-r}] + \theta_1\eta'_1 + \theta_2(\pi ie'_{-1} + \eta'_{-1})$
$\qquad + \sum[(\log g_r)e'_{-r} - \eta'_{-r} - (\log g_r)e'_r - \eta'_r] - \theta_1\eta'_1 + \theta_2(\pi ie'_{-1} - \eta'_{-1})$
$\qquad = 2\theta_2\pi ie'_{-1},$
and therefore, if we set

(27) $\zeta = \theta_2e_{-1} \quad (\theta_2 = 0, 1), \qquad w = z - \pi i\zeta,$

we have

(28) $w' + awa^{-1} = 0, \qquad \zeta' - a\zeta a^{-1} = 0,$

and

(29) $w = \sum[(\log g_r)e_r + \eta_r - (\log g_r)e_{-r} + \eta_{-r}] + \theta_1\eta_1 + \theta_2\eta_{-1}.$

The general value of $x$ can therefore be expressed in terms of the solution of the
equation discussed in §9.02.

If we now start with $w$ as a solution of (28) and define $x$ by $x = e^w$, then

$x' = e^{w'} = e^{-awa^{-1}} = ae^{-w}a^{-1} = ax^{-1}a^{-1}$

and therefore $x$ is a solution of (1); to obtain every solution, however, we must
add the terms $\pi i\zeta$ to $w$.

If $e_0$ is the principal idempotent element corresponding to the root 0, then
(29) shows that the presence of the $\zeta$-term depends on the division of $e_0$ into two
parts $e_1$ and $e_{-1}$ which satisfy the second set of equations in (26); and
corresponding to these we have nilpotent parts $\eta_1$ and $\eta_{-1}$ which give rise to 1 and $-1$,
respectively, as roots of $x$, or 0, $\pi i$ as roots of $z$.

A form which gives rational parameters is obtained from the exponential
solution as follows. Let

(30) $t = \tanh(z/2) = \frac{e^z - 1}{e^z + 1} = \frac{x - 1}{x + 1};$

then

(31) $x = \frac{1 + t}{1 - t}$

and

$ata^{-1} = \frac{axa^{-1} - 1}{axa^{-1} + 1} = \frac{1 - ax^{-1}a^{-1}}{1 + ax^{-1}a^{-1}} = \frac{1 - x'}{1 + x'}$

so that (31) gives a solution of (1). If, however, $|x + 1| = 0$, then $t$ becomes
infinite so that (31) cannot give directly any $x$ which has $-1$ as a root. This
difficulty arises from the fact that $\tanh(z/2) = \infty$ when $z = \pi i$; but, since
$(1 + t)/(1 - t) = e^z$ for all values of $t$ which do not have an infinite root, that is, one
corresponding to a root $(2k + 1)\pi i$ of $z$, $x$ will be a solution of (1) so long
as the coordinates of $z$ are continuous functions of the parameters involved
and the limiting value of $x$ is finite and determinate.


9.06 Matrices which admit a given transformation. In (1) we may regard $x$
as given and $a$ as unknown; the problem then is to find all matrices $a$ such that

(32) $x'ax = a.$

If we associate with $a = \|a_{ij}\|$ the corresponding tensor of grade 2,

$u = \sum a_{ij}e_ie_j,$

we see immediately that (32) corresponds to setting (cf. §5.10)

(33) $\Pi_2(x)u = u.$

Hence there is a solution if, and only if, $x$ has at least one pair of reciprocal roots;
in this case $\Pi_2(x)$ has one or more roots equal to 1 and the various invariant
elements corresponding to this root give a linearly independent set of
determinations of $a$.

When it is required that $|a| \neq 0$, another form of solution is preferable. In
this case $x' = ax^{-1}a^{-1}$; but, since $x$ and $x'$ are similar, we also have $x' = pxp^{-1}$,
where, if $p_1$ is one determination of $p$, the general form is

(34) $p = p_1b, \qquad bx = xb, \quad |b| \neq 0.$

Hence it is necessary that $x^{-1}$ be similar to $x$, say

(35) $x^{-1} = q_1^{-1}xq_1,$

which gives immediately

(36) $a = p_1bq_1.$

Conversely, if $p_1$, $b$, and $q_1$ satisfy the given conditions, it follows immediately
that (36) gives a solution of (32).



CHAPTER X 

LINEAR ASSOCIATIVE ALGEBRAS 

10.01 Fields and algebras. A set of elements which are subject to the laws
of ordinary rational algebra is called a field. We may make this idea more
precise as follows. Let $a, b, \cdots$ be a set of entities, $F$, which are subject to two
operations, addition and multiplication; this set is called a field if it satisfies the
following postulates:¹

A1. $a + b$ is a uniquely determined element of $F$.

A2. $a + b = b + a$.

A3. $(a + b) + c = a + (b + c)$.

A4. There is a unique element 0 in $F$ such that $a + 0 = a$ for every element $a$
in $F$.

A5. For every element $a$ in $F$ there exists a unique element $b$ in $F$ such
that $a + b = 0$.

M1. $ab$ is a unique element of $F$.

M2. $ab = ba$.

M3. $ab \cdot c = a \cdot bc$.

M4. There is a unique element 1 in $F$ such that $a1 = a$ for every $a$ in $F$.

M5. For every element $a \neq 0$ in $F$ there exists a unique element $b$ in $F$ such
that $ab = 1$.

AM. $a(b + c) = ab + ac$, $(b + c)a = ba + ca$.

R. If $m$ is a whole number and $ma$ denotes the element which results from
adding together $m$ $a$'s, then $ma \neq 0$ for any $m > 0$ provided that $a \neq 0$.

If M2 is omitted the resulting set is said to be a division algebra. This does not
imply that M2 does not hold, only that it is not presupposed; if it does hold,
the algebra is said to be commutative. If M2, 4, 5 are all omitted, the
corresponding set is called an associative algebra. If the algebra contains an identity,
that is, an element satisfying the condition laid down in M4 for 1, this element is
called the principal unit of the algebra. Postulate R is included merely as a
matter of convenience; its effect is to exclude modular fields. In consequence
of R every field which we shall consider contains² the field of rational numbers
as a subset.

As an example of a field we may take the field of rational numbers extended
by a cube root of unity, $\omega = (-1 + \sqrt{-3})/2$. Every number of this field can
be put in the form

$a = \alpha + \beta\omega = \alpha 1 + \beta\omega$

¹ These postulates are not independent; they are formed so as to show the principal
properties of the set. In place of M5 it is often convenient to take: M5'. If $a \neq 0$, $ax = 0$
implies $x = 0$.

² Strictly speaking, we should say that the field contains a subset simply isomorphic with
the field $R$ of rational numbers. This subset is then used in place of $R$ in the same way as
scalars are replaced by scalar matrices in §1.04.
where $\alpha$ and $\beta$ are rational numbers; the form of $a$ is unique since
$\alpha + \beta\omega = \gamma + \delta\omega$ gives $(\beta - \delta)\omega = \gamma - \alpha$ and, since $\omega$ is not rational, this is impossible
unless $\beta - \delta = 0 = \gamma - \alpha$. We say that 1, $\omega$ is a basis of $F$ relative to the field
$R$ of rational numbers, and $F$ is said to be a field of order 2 over $R$.
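
The arithmetic of this field can be sketched as follows. The representation of $\alpha + \beta\omega$ by the pair $(\alpha, \beta)$ and the function names are our own choices for illustration; the rule of multiplication uses only $\omega^2 = -1 - \omega$, and the inverse uses the fact that $(\alpha + \beta\omega)(\alpha + \beta\omega^2) = \alpha^2 - \alpha\beta + \beta^2$.

```python
from fractions import Fraction

def mul(u, v):
    """(a + b w)(c + d w) with w^2 = -1 - w, on pairs of rationals."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c - b * d)

def inverse(u):
    """Inverse of a + b w, as required by postulate M5; fails only for 0."""
    a, b = Fraction(u[0]), Fraction(u[1])
    norm = a * a - a * b + b * b      # (a + b w)(a + b w^2)
    if norm == 0:
        raise ZeroDivisionError("zero has no inverse")
    return ((a - b) / norm, -b / norm)

one = (Fraction(1), Fraction(0))
omega = (Fraction(0), Fraction(1))

omega_squared = mul(omega, omega)       # -1 - w
omega_cubed = mul(omega, omega_squared) # should be 1
u = (Fraction(2), Fraction(3))          # the element 2 + 3w

print(omega_squared, omega_cubed, mul(u, inverse(u)))
```

The norm $\alpha^2 - \alpha\beta + \beta^2$ vanishes for rational $\alpha$, $\beta$ only when both are zero, which is why every element other than 0 has an inverse.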

As an example of an associative algebra we may take the algebra of matrices
with rational coordinates. Here any element $a$ of the algebra can be put
uniquely in the form $a = \sum a_{ij}e_{ij}$, where the $a_{ij}$ are rational numbers; and
$e_{ij}$ ($i, j = 1, 2, \cdots, n$) form a basis of the algebra, which is of order $n^2$. We
also have an algebra if the coordinates $a_{ij}$ are taken to be any elements of the
field $F = (1, \omega)$ described above. This algebra is one of order $n^2$ over $F$.
Instead of regarding it as an algebra over $F$ we may clearly look on it as an
algebra of order $2n^2$ over $R$ the basis being $e_{ij}$, $\omega e_{ij}$ ($i, j = 1, 2, \cdots, n$).

10.02 Algebras which have a finite basis. Let $A$ be a set of elements which
form an associative algebra and $G$ a subset which is also an algebra. We shall
say that $a_1, a_2, \cdots, a_n$ form a basis of $A$ relatively to $G$ if (i) each $a_i$ lies in $A$,
(ii) every element of $A$ can be put uniquely in the form

(1) $a = \gamma_1a_1 + \gamma_2a_2 + \cdots + \gamma_na_n$

where the $\gamma$'s belong to $G$. Though it is not altogether necessary to do so, we
shall restrict ourselves to the case in which $G$ is a field which contains the rational
field, that is, we assume as a postulate:

BR. For every algebra $A$ under consideration there exists a non-modular
field $F$ and a subset of elements $a_1, a_2, \cdots, a_n$ such that (i) every element of $A$
can be put uniquely in the form

$\sum_{i=1}^{n}\alpha_ia_i \qquad (\alpha_i \text{ in } F)$

and (ii) every element of this form belongs to $A$; and further the elements of $F$
are commutative with $a_1, a_2, \cdots, a_n$.

Since the product of any two elements of $A$ is also an element of $A$, and can
therefore be expressed in the form (1), we have

(2) $a_ia_j = \sum_k\gamma_{ijk}a_k \qquad (i, j = 1, 2, \cdots, n)$

where $\gamma_{ijk}$ are elements of $F$. Since the law of combination of the elements of $F$
is supposed known, (2) defines the product of any two elements of $A$; for

(3) $(\sum\alpha_ia_i)(\sum\beta_ja_j) = \sum\alpha_i\beta_ja_ia_j = \sum\alpha_i\beta_j\gamma_{ijk}a_k.$

If the values of the $\gamma$'s are assigned arbitrarily in $F$, it is readily shown that the
only postulate which is possibly violated is M3 which states that $ab \cdot c = a \cdot bc$;
and in order that this condition shall be satisfied it is necessary and sufficient
that $a_ia_j \cdot a_k = a_i \cdot a_ja_k$ for all the elements of the basis. This gives immediately
the 'associativity' condition

(4) $\sum_\alpha\gamma_{ij\alpha}\gamma_{\alpha kl} = \sum_\alpha\gamma_{i\alpha l}\gamma_{jk\alpha} \qquad (i, j, k, l = 1, 2, \cdots, n).$

10.03 The matric representation of an algebra. If we set

(5) $A_i = \sum_{p,q=1}^{n}\gamma_{iqp}e_{pq} \qquad (i = 1, 2, \cdots, n),$

the law of multiplication for matrices gives

$A_iA_j = \sum\gamma_{i\alpha p}\gamma_{jq\alpha}e_{pq}$

and therefore from (4)

$A_iA_j = \sum\gamma_{ij\alpha}\gamma_{\alpha qp}e_{pq} = \sum\gamma_{ij\alpha}A_\alpha.$

Hence the set of matrices of the form $\sum\alpha_iA_i$ is isomorphic with the given algebra
in regard to both addition and multiplication. Further, if the algebra contains
the identity, the isomorphism is simple; for, if there exist elements $\alpha_i$ of the field
such that $\sum\alpha_iA_i = 0$, it follows that

$(\sum\alpha_ia_i)x = 0$

for every element $x$ of the algebra, and putting $x = 1$ we get $\sum\alpha_ia_i = 0$.
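
As a small illustration, the constants of multiplication, the associativity condition (4) and the matrices (5) may be written out for the field $(1, \omega)$ of §10.01 regarded as an algebra of order 2 over the rationals. This is a sketch only: the basis is taken as $a_1 = 1$, $a_2 = \omega$ (indexed from 0 in the code), and the array `gamma` and the function names are ours.

```python
# gamma[i][j][k] is the coefficient of a_k in the product a_i a_j,
# for the basis a_0 = 1, a_1 = omega with omega^2 = -1 - omega.
gamma = [
    [[1, 0], [0, 1]],     # 1*1 = 1,          1*omega = omega
    [[0, 1], [-1, -1]],   # omega*1 = omega,  omega*omega = -1 - omega
]

def is_associative(g):
    """Condition (4): sum_s g[i][j][s] g[s][k][l] = sum_s g[i][s][l] g[j][k][s]."""
    n = len(g)
    return all(
        sum(g[i][j][s] * g[s][k][l] for s in range(n))
        == sum(g[i][s][l] * g[j][k][s] for s in range(n))
        for i in range(n) for j in range(n)
        for k in range(n) for l in range(n))

def representation(g, i):
    """Matrix A_i of (5): the entry in row p, column q is gamma[i][q][p]."""
    n = len(g)
    return [[g[i][q][p] for q in range(n)] for p in range(n)]

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][t] * q[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def combination(g, i, j, mats):
    """The matrix sum_s gamma[i][j][s] A_s."""
    n = len(g)
    return [[sum(g[i][j][s] * mats[s][p][q] for s in range(n))
             for q in range(n)] for p in range(n)]

A = [representation(gamma, i) for i in range(2)]
products_agree = all(
    matmul(A[i], A[j]) == combination(gamma, i, j, A)
    for i in range(2) for j in range(2))

print(is_associative(gamma), A[1], products_agree)
```

Here $A_2$ (printed as `A[1]`) is the matrix of the transformation which multiplication by $\omega$ induces on the coordinates, and it satisfies the same equation $\omega^2 + \omega + 1 = 0$ as $\omega$ itself.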

If the algebra does not have a principal unit, all that is necessary is to replace
(5) by

(6) $A_i = \sum_{p,q=1}^{n+1}\gamma_{iqp}e_{pq}$

where $\gamma_{ij,n+1} = 0$ ($i, j \leq n$) and $\gamma_{n+1,i,j} = \delta_{ij} = \gamma_{i,n+1,j}$ for all $i$ and $j$.

The importance of this representation is that it enables us to carry over the
theory of the characteristic and reduced equations from the theory of matrices.
The main theorem is as follows.

Theorem 1. The general element $x = \sum\xi_ia_i$ satisfies an equation of the form

(7) $x^m + b_1x^{m-1} + \cdots + b_m = 0$

where $b_p$ is a rational homogeneous polynomial in the $\xi$'s of degree $p$; and if the
variable coordinates $\xi_i$ are given particular values in $F$, there exists a rational
polynomial

(8) $\varphi(\lambda) = \lambda^\mu + \beta_1\lambda^{\mu-1} + \cdots + \beta_\mu$

such that (i) $\varphi(x) = 0$, (ii) if $\psi(\lambda)$ is any polynomial with coefficients in $F$ such that
$\psi(x) = 0$, then $\varphi(\lambda)$ is a factor of $\psi(\lambda)$.

This theorem follows immediately from the theory of the reduced equation
as given in §2.05 and from the fact that the equation which is satisfied by the
general element must clearly be homogeneous in the coordinates of that element.

As in the theory of matrices, $-b_1$ is called the trace of $x$ and is written $\mathrm{tr}(x)$.
The trace is linear and homogeneous in the coordinates and hence $\mathrm{tr}(x + y) =
\mathrm{tr}(x) + \mathrm{tr}(y)$.

10.04 The calculus of complexes. If $x_1, x_2, \cdots, x_r$ are any elements of an
algebra $A$ in a field $F$, the set $B$ of all elements of the form $\sum\xi_ix_i$ ($\xi_i$ in $F$) is
called a complex³ or linear set. Any subset $B$ of $A$ which has the property that,
when $x$, $y$ are any two of its elements, then $\xi x + \eta y$ is also an element of the set
is a complex. This follows readily from the theory of linear dependence and the
existence of a finite basis for $A$; it is also easily shown that any subcomplex of $A$
has a finite basis; the order of this basis is called the order of the complex.

We shall write $B = (x_1, x_2, \cdots, x_r)$; this does not imply that the $x$'s are
necessarily linearly independent. If $C = (y_1, y_2, \cdots, y_s)$ is a second complex,
the sum of $B$ and $C$ is defined by

$B + C = (x_1, x_2, \cdots, x_r, y_1, y_2, \cdots, y_s),$

that is, $B + C$ is the set of all elements of the form $x + y$ where $x$ lies in $B$ and
$y$ in $C$. Similarly the product is defined by

$BC = (x_iy_j; \ i = 1, 2, \cdots, r, \ j = 1, 2, \cdots, s).$

The set of elements common to $B$ and $C$ forms a complex called the intersection
of $B$ and $C$; it is denoted by $B \frown C$. If $B$ and $C$ have no⁴ common element, we
write $B \frown C = 0$. If every element of $C$ lies in $B$ but not every element of
$B$ in $C$, we shall write $C < B$; in this case $B + C = B$. A complex of order 1 is
defined by a single element, say $x_1$, and for most purposes it is convenient to
denote the complex $(x_1)$ simply by $x_1$; $x_1 < B$ then means that $x_1$ is an element
of $B$.

If a complex $B$ is an algebra, the product of any two of its elements lies in $B$
and hence $B^2 \leq B$; conversely, if this condition is satisfied, the definition of the
product $BB = B^2$ shows that $B$ is an algebra.

We add a summary of the properties of the symbols introduced in this section.

$B + C = C + B, \quad (B + C) + D = B + (C + D), \quad BC \cdot D = B \cdot CD,$

$B \frown C = C \frown B, \quad (B \frown C) \frown D = B \frown (C \frown D),$

$B(C + D) = BC + BD, \quad (C + D)B = CB + DB,$

$B + (C \frown D) \leq (B + C) \frown (B + D), \quad B(C \frown D) \leq BC \frown BD.$

³ The term 'complex,' which was introduced by Frobenius in the theory of groups, is
more convenient than 'linear set' and no confusion is likely to arise between this meaning
of the term and the one used in geometry.

⁴ To avoid circumlocution we say the complexes have 'no element in common' in place
of the more correct phrase 'no element in common except 0.'
If $B \leq C$, then $B + C = C$, and conversely.

If $B < C$, there exists $D < C$ such that $C = B + D$, $B \frown D = 0$.

If $B = C + D$ and $C \frown D = 0$, we shall say that $B$ is congruent to $C$ modulo
$D$, or

$B \equiv C \pmod{D};$

and if $b$, $c$, $d$ are elements of $B$, $C$, $D$, respectively, such that $b = c + d$, then
$b \equiv c \pmod{D}$, $c \equiv b \pmod{D}$.


10.05 The direct sum and product. If $A = (a_1, a_2, \cdots, a_\alpha)$ and $B = (b_1,
b_2, \cdots, b_\beta)$ are associative algebras of orders $\alpha$, $\beta$, respectively, over the same
field $F$, we can define a new algebra in terms of them as follows. Let $C$ be the
set of all pairs of elements $(a, b)$ where $a < A$ and $b < B$ and two pairs $(a, b)$,
$(a', b')$ are regarded as equal if, and only if, $a = a'$, $b = b'$. If we define addition
and multiplication by

      $(a, b) + (a', b') = (a + a', b + b')$
(9)   $(a, b)(a', b') = (aa', bb')$
      $\xi(a, b) = (\xi a, \xi b) \qquad (\xi \text{ in } F),$

it is readily shown that the set $C$ forms an associative algebra. This algebra
is called the direct sum of $A$ and $B$ and is denoted by $A \oplus B$; its order is $\alpha + \beta$.

The set $\mathfrak{A}$ of all elements of the form $(a, 0)$ forms an algebra which is simply
isomorphic with $A$, and the set $\mathfrak{B}$ of elements $(0, b)$ forms an algebra which is
simply isomorphic with $B$; also

$C = \mathfrak{A} + \mathfrak{B}, \qquad \mathfrak{A}\mathfrak{B} = 0 = \mathfrak{B}\mathfrak{A}, \qquad \mathfrak{A} \frown \mathfrak{B} = 0.$

In consequence of this it is generally convenient to say that $C$ is the direct sum
of $\mathfrak{A}$ and $\mathfrak{B}$.

If we replace (9) by

      $\xi(a, b) = (\xi a, b) = (a, \xi b) \qquad (\xi \text{ in } F)$
      $(a, b)(a', b') = (aa', bb'),$

we get another type of algebra of order $\alpha\beta$ which is called the direct product
of $A$ and $B$ and is denoted by $A \otimes B$ or by $A \times B$ when there is no chance of
confusion. If both $A$ and $B$ contain the identity, the set $\mathfrak{A}$ of elements of the
form $(a, 1)$ forms an algebra simply isomorphic with $A$ and the set $\mathfrak{B}$ of elements
$(1, b)$ is an algebra simply isomorphic with $B$; also⁵

$C = \mathfrak{A}\mathfrak{B} = \mathfrak{B}\mathfrak{A}, \qquad \mathfrak{A} \frown \mathfrak{B} = (1, 1) = 1,$

and the order of $C$ is the product of the orders of $\mathfrak{A}$ and $\mathfrak{B}$. As in the case of
the direct sum it is convenient to say that $C$ is the direct product of $\mathfrak{A}$ and $\mathfrak{B}$
and to indicate this by writing $C = \mathfrak{A} \times \mathfrak{B}$.

⁵ Strictly speaking we should use different symbols here for the identity elements of the
separate algebras.
The following theorem gives an instance of the direct product which we shall
require later.

Theorem 2. If an algebra $A$, which contains the identity, contains also the matric
algebra $M = (e_{ij}; \ i, j = 1, 2, \cdots, n)$, the identity being the same for $A$ and $M$, then $A$
can be expressed as the direct product of $M$ and another algebra $B$.

Let $B$ be the set of elements of $A$ which are commutative with every element
of $M$; these elements form an algebra since, if $b_ie_{pq} = e_{pq}b_i$ ($i = 1, 2, \cdots$), then
also

$(b_i + b_j)e_{pq} = e_{pq}(b_i + b_j), \qquad b_ib_je_{pq} = e_{pq}b_ib_j.$

Further $B \frown M$ is the field $F$, since scalars are the only elements of $M$ which are
commutative with every element of $M$.

If $x$ is any element of $A$ and

$x_{pq} = \sum_i e_{ip}xe_{qi},$

then

$x_{pq}e_{rs} = \sum_i e_{ip}xe_{qi}e_{rs} = e_{rp}xe_{qs} = \sum_i e_{rs}e_{ip}xe_{qi} = e_{rs}x_{pq}$

so that $x_{pq}$ belongs to $B$. Also

$\sum_{p,q}x_{pq}e_{pq} = \sum_{p,q}\sum_i e_{ip}xe_{qi}e_{pq} = \sum_{p,q}e_{pp}xe_{qq} = x$

so that $A = BM$, which proves the theorem.
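
The construction used in this proof can be followed on an example. In the sketch below, which is an illustration under our own assumptions and not part of the theorem, $A$ is taken to be the algebra of integral matrices of order 4 and $M$ the matric algebra of order $2^2$ whose units `E(p, q)` have the identity of order 2 in block $(p, q)$; the matrix `x` is arbitrary.

```python
N = 4  # order of the matrices forming the algebra A in this example

def zeros():
    return [[0] * N for _ in range(N)]

def matmul(p, q):
    return [[sum(p[i][t] * q[t][j] for t in range(N)) for j in range(N)]
            for i in range(N)]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(N)] for i in range(N)]

def E(p, q):
    """Unit e_pq of the contained matric algebra M (p, q = 0, 1):
    the identity of order 2 placed in block (p, q)."""
    m = zeros()
    for s in range(2):
        m[2 * p + s][2 * q + s] = 1
    return m

def component(x, p, q):
    """x_pq = sum over i of e_ip x e_qi, as in the proof of Theorem 2."""
    total = zeros()
    for i in range(2):
        total = add(total, matmul(matmul(E(i, p), x), E(q, i)))
    return total

# An arbitrary element x of A.
x = [[(3 * i + 5 * j * j + i * j) % 7 for j in range(N)] for i in range(N)]

# Every x_pq should be commutative with every unit of M.
commutes = all(
    matmul(component(x, p, q), E(r, s)) == matmul(E(r, s), component(x, p, q))
    for p in range(2) for q in range(2) for r in range(2) for s in range(2))

# The sum of x_pq e_pq should give back x.
rebuilt = zeros()
for p in range(2):
    for q in range(2):
        rebuilt = add(rebuilt, matmul(component(x, p, q), E(p, q)))

print(commutes, rebuilt == x)
```

In this example each `component(x, p, q)` repeats one block of order 2 of `x` along the diagonal, so that $B$ is itself simply isomorphic with the matrices of order 2.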


10.06 Invariant subalgebras. If $B$ is a subalgebra of $A$ such that

(10) $AB \leq B, \qquad BA \leq B,$

then $B$ is called an invariant subalgebra of $A$. If we set

$A = B + C, \qquad B \frown C = 0,$

the product of any two elements $c_i$, $c_j$ of $C$ lies in $A$ and hence

$c_ic_j = c_{ij} + b_{ij}, \qquad c_{ij} < C, \quad b_{ij} < B.$

If we now introduce a new operation $\times$ defined by

(11) $c_i \times c_j = c_{ij},$

then the operations $+$ and $\times$, when used to combine elements of $C$, satisfy all the
postulates for an associative algebra. To prove this we need only consider the
associativity postulate M3 since the proofs of the others are immediate. If
$c_1$, $c_2$, $c_3$ are any elements of $C$, then both $c_1 \times (c_2 \times c_3)$ and $(c_1 \times c_2) \times c_3$ differ
by an element of $B$ from $c_1c_2c_3$; their difference is therefore an element of both $B$
and $C$ and hence is 0.

The elements of $C$ therefore form an associative algebra relatively to the
operations $+$ and $\times$. When this algebra is considered abstractly, the operation
$\times$ may be called multiplication; the resulting algebra is called the difference
algebra of $A$ and $B$ and is denoted by $(A - B)$.

The difference algebra may also be defined as follows. Let $b_1, b_2, \cdots, b_\beta$
be a basis of $B$ and $c_1, c_2, \cdots, c_\gamma$ a basis of $C$, so that $b_1, b_2, \cdots, b_\beta, c_1, \cdots, c_\gamma$
is a basis of $A$. Since $A$ is an algebra, the product $c_ic_j$ can be expressed in terms
of this basis and we may therefore set

(12) $c_ic_j = \sum\gamma_{ijk}c_k + \sum\delta_{ijk}b_k.$

The argument used above then shows that

(13) $d_id_j = \sum\gamma_{ijk}d_k$

defines an associative algebra when $B$ is invariant.

It is readily seen that the form of the difference algebra is independent of the
particular complex $C$ which is used to supplement $B$ in $A$. For if $A = B + P$,
$B \frown P = 0$, it follows that to an element $p$ of $P$ there corresponds an element $c$
of $C$ such that $p - c < B$; and we may therefore choose a basis for $P$ for which

$p_i = c_i + q_i \qquad (q_i < B; \ i = 1, 2, \cdots, \gamma).$

Equation (12) then gives

$p_ip_j = \sum\gamma_{ijk}p_k + b_{ij}$

where $b_{ij} = q_iq_j + q_ic_j + c_iq_j + \sum\delta_{ijk}b_k - \sum\gamma_{ijk}q_k < B$,

and the algebra derived from this in the same way as (13) is from (12) is
abstractly the same as before.

If the algebra $A$ does not contain the identity, it may happen that $A^2 < A$,
$A^3 < A^2$, and so on. Since the basis of $A$ is finite, we must however have at
some stage

$A^m < A^{m-1}, \qquad A^{m+1} = A^m;$

the integer $m$ is then called the index of $A$. The most interesting case is when
$A^m = 0$; the algebra is then said to be nilpotent.
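
The descending chain of powers can be computed directly for a small nilpotent algebra. The sketch below assumes, purely for illustration, that $A$ is the algebra of matrices of order 3 with zeros on and below the diagonal, with basis $e_{12}$, $e_{13}$, $e_{23}$; the order of each power is found as the rank of the set of all products of the spanning elements. The function names are ours.

```python
from fractions import Fraction

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][t] * q[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def unit(i, j, n=3):
    """Matrix unit with 1 in row i, column j (indices from 0)."""
    return [[int((r, c) == (i, j)) for c in range(n)] for r in range(n)]

def rank(vectors):
    """Rank of a list of vectors by elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def order(complex_):
    """Order of the complex spanned by the given matrices."""
    return rank([[v for row in m for v in row] for m in complex_])

def product(b, c):
    """Spanning set of the product complex BC."""
    return [matmul(x, y) for x in b for y in c]

A = [unit(0, 1), unit(0, 2), unit(1, 2)]

orders = []          # orders of A, A^2, ... up to the last non-zero power
power = A
while order(power) > 0:
    orders.append(order(power))
    power = product(power, A)

index = len(orders) + 1   # first m with A^m = 0
print(orders, index)
```

For this algebra $A^2$ is spanned by $e_{13}$ alone and $A^3 = 0$, so that the index is 3.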

When $N_1$ and $N_2$ are nilpotent subalgebras of $A$ which are also invariant,
then $N_1 + N_2$ is a nilpotent invariant subalgebra of $A$. This is shown as
follows. Let $m_1$, $m_2$ be the indices of $N_1$ and $N_2$ respectively; $N_3 = N_1 \frown N_2$
is nilpotent and, since $N_3^{m_1} < N_1^{m_1} = 0$, its index $m_3$ is not greater than $m_1$. Now

$(N_1 + N_2)^2 = N_1^2 + N_2^2 + N_1N_2 + N_2N_1 < N_1^2 + N_2^2 + N_3 < N_1 + N_2$

since it follows from the invariance of $N_1$ and $N_2$ that $N_1N_2$ and $N_2N_1$ are
contained in both $N_1$ and $N_2$ and therefore in $N_3$. Similarly

$(N_1 + N_2)^r < N_1^r + N_2^r + N_3$

so that, if $m$ is the greater of $m_1$ and $m_2$,

$(N_1 + N_2)^m < N_1^m + N_2^m + N_3 = N_3$

and hence $N_1 + N_2$ is a nilpotent subalgebra. Further

$A(N_1 + N_2) = AN_1 + AN_2 \le N_1 + N_2,$
$(N_1 + N_2)A = N_1A + N_2A \le N_1 + N_2$

so that $N_1 + N_2$ is invariant. It follows that the totality of all nilpotent
invariant subalgebras is itself a nilpotent invariant subalgebra; this algebra is
called the maximal nilpotent invariant subalgebra or radical of $A$.

An algebra $A$ which is not nilpotent and which has no radical is said to be
semi-simple; if in addition it has no invariant subalgebra, it is said to be simple.⁶
We have then the following theorem whose proof we leave to the reader.

Theorem 3. If $N$ is the radical of a non-nilpotent algebra $A$, then $(A - N)$ is
semi-simple.


10.07 Idempotent elements. In the preceding section we defined a nilpotent
algebra of index $m$ as one for which $A^m = 0$, $A^{m-1} \ne 0$. An immediate
consequence of this definition is that every element of a nilpotent algebra is
nilpotent; we shall now prove the converse by showing that, if $A$ is not nilpotent,
it contains an idempotent element.


Theorem 4. Every algebra which is not nilpotent contains an idempotent element.

Let $A = (a_1, a_2, \cdots, a_\alpha)$ be an algebra of order $\alpha$. If $aA = A$ for some
element $a$ in $A$, then $ax = 0$ only when $x = 0$; for $aA = A$ implies that $aa_1$,
$aa_2, \cdots, aa_\alpha$ is a basis, which means that there is no relation of the form

$0 = \sum_i \xi_i aa_i = a \sum_i \xi_i a_i$

except when every $\xi_i = 0$. Also, if $aA = A$, there must be an element $e$ in $A$
such that $ae = a$; this gives $ae^2 = ae$ or $a(e^2 - e) = 0$ and hence $e^2 = e$.

The theorem is true of algebras of order 1; assume it true for algebras of order
less than $\alpha$. If $a_iA = A$ for some $a_i$, the theorem has just been shown to hold.
If $a_iA < A$ for every $a_i$ in the basis of $A$, then, since $(a_iA)^2 = a_iAa_iA \le a_iA$,
either $a_iA$ contains an idempotent element or, being of order less than $\alpha$, it is
nilpotent. Now $(Aa_iA)^r \le A(a_iA)^r$ and therefore, when $a_iA$ is nilpotent, $Aa_iA$
is also nilpotent; but

$A \cdot Aa_iA \le Aa_iA, \quad Aa_iA \cdot A \le Aa_iA$

so that $Aa_iA$ is invariant and being nilpotent is contained in the radical $N$ of $A$.
Hence, if every $a_iA$ is nilpotent,

$A^3 = \sum_i Aa_iA \le N$

so that $A^3$, and therefore also $A$, is nilpotent, contrary to the hypothesis of the
theorem. It follows that some $a_iA$ is not nilpotent and being of lower order
than $A$ contains an idempotent element by assumption. The theorem is
therefore proved.

6 Simple algebras are usually excluded from the class of semi-simple algebras; it seems
more convenient however to include them. The statement that $A$ is not nilpotent is made
in order to exclude the algebra of order 1 defined by a single element whose square is 0.

The following lemma is an immediate consequence of Theorem 4.

Lemma 1. A non-nilpotent algebra cannot have a basis every element of which is
nilpotent, nor a basis for which the trace of every element is 0.

For, if every element of the basis is nilpotent, the trace of every element of the
algebra is 0 whereas the trace of an idempotent element is not 0 since the only
roots of its characteristic equation are 0 and 1.
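
The trace argument of Lemma 1 can be tried on a small case. The following is only a sketch, assuming the elements are realised as 2 x 2 matrices with rational coordinates; the names `matmul`, `trace`, `n_elem` and `e_elem` are introduced here for illustration and do not occur in the text.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    """Sum of the coordinates in the main diagonal."""
    return sum(x[i][i] for i in range(len(x)))

half = Fraction(1, 2)
zero = [[Fraction(0)] * 2 for _ in range(2)]

# A nilpotent element (its square is 0) and an idempotent element of rank 1.
n_elem = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(0)]]
e_elem = [[half, half], [half, half]]

assert matmul(n_elem, n_elem) == zero      # n_elem is nilpotent
assert matmul(e_elem, e_elem) == e_elem    # e_elem is idempotent

# The nilpotent element has trace 0; the idempotent one has trace 1, not 0.
print(trace(n_elem), trace(e_elem))
```

The roots of the characteristic equation of `e_elem` are 0 and 1, so its trace is the number of roots equal to 1, which is what the lemma uses.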

If $e$ is the only idempotent element in $eAe$, it is said to be primitive. An
algebra which is not nilpotent contains at least one primitive idempotent
element. For, if $eAe$ contains an idempotent element $e_1 \ne e$, then $e_1(e - e_1) = 0$
so that $e_1Ae_1$ does not contain $e - e_1$ and is therefore of lower order than $eAe$;
since the order of $eAe$ is finite, a succession of such steps must lead to a primitive
idempotent element.

Theorem 4.5. A simple algebra has a principal unit.

If $A$ is not nilpotent, it contains an idempotent element $e$. If $a$ is any element
of $A$, we may set $a = a_1 + a_2$ where

$a_1 = ea + ae - eae < eA + Ae, \quad a_2 = a - a_1, \quad ea_2 = 0 = a_2e.$

We can therefore find a complex $A_1$ such that

$A = eA + Ae + A_1, \quad (eA + Ae) \frown A_1 = 0, \quad eA_1 = 0 = A_1e.$

If $A_1$ is not nilpotent, it contains an idempotent element $e'$ and $e + e'$ is also
idempotent since $ee' = 0 = e'e$. We can therefore take $e + e'$ in place of $e$ so
reducing the order of $A_1$, and after a finite number of such steps we arrive at a
stage at which $A_1$ contains no idempotent element and is therefore nilpotent;
we shall now assume that $e$ was chosen at the start so that $A_1$ is nilpotent; we
shall also assume that $e$ is not an identity for $A$ and there is no real loss of
generality in assuming in addition that it is not a left-hand identity.

Let $r$ be the index of $A_1$. If $r > 1$ and $x \ne 0$ is any element of $A_1^{r-1}$ then
$xA_1 = 0 = A_1x$, $ex = 0$; if $r = 1$, then $A_1 = 0$ and since $e$ is not a left-hand
identity, $e < eA < A$ so that there is an $x \ne 0$ such that $ex = 0$; we have
therefore in both cases

$xA_1 = 0 = A_1x, \quad ex = 0.$

We now have $Ax = eAx$, $AxA = eAxA$; hence $Ax < A$, $AxA < A$ and $AxA$ is
therefore an invariant subalgebra of $A$; if $AxA = 0$, then $Ax$ is invariant and
not equal to $A$; if $Ax = 0$, then $xA$ is a proper invariant subalgebra unless it is 0
in which case $X = \{x\}$ is a non-zero invariant subalgebra of $A$. In the case of a
simple algebra it follows that $e$ is an identity.

Corollary. An algebra without a principal unit is not semi-simple. For
$(Ax)^2 = AxAx = AxeAx = 0$ if $Ax \ne 0$.

10.08 Matric subalgebras. Let $A$ be an algebra which contains the identity
and let $e_1$ be a primitive idempotent element; then $e_\alpha = 1 - e_1$ is also idempotent
and, if $e_iAe_j$ is denoted by $A_{ij}$, then

$A = (e_1 + e_\alpha)A(e_1 + e_\alpha) = A_{11} + A_{1\alpha} + A_{\alpha 1} + A_{\alpha\alpha}.$

Suppose in the first place that $A_{\alpha 1}A_{1\alpha}$ is not nilpotent; there is then some
$a_{12} < A_{1\alpha}$ such that $A_{\alpha 1}a_{12}$, which is an algebra, is not nilpotent since otherwise
$A_{\alpha 1}A_{1\alpha}$ would have a basis of nilpotent elements, which is impossible by Lemma 1;
hence some such $A_{\alpha 1}a_{12}$ contains an idempotent element, say $e_2 = a_{21}a_{12}$. If $e_2$
is not primitive in $A$, say $e_2 = e' + e''$, $e'e'' = 0 = e''e'$, where $e'$ is primitive in $A$,
then $a_{12}e' \ne 0$ since otherwise

$e' = e_2e' = a_{21}a_{12}e' = 0;$

also $e' < A_{\alpha\alpha}$ since $0 = e_1e_2 = e_1e' + e_1e''$ so that $e_1e'' = -e_1e'$ and therefore

$e_1e' = -e_1e''e' = 0$

and similarly $e'e_1 = 0$; we may therefore take $a'_{12} = a_{12}e'$ and $a'_{21} = e'a_{21}$ in place
of $a_{12}$ and $a_{21}$, which gives $e'$ in place of $e_2$. We can therefore assume $a_{12}$ so chosen
that $e_2$ is primitive in $A$; also, since $e_2a_{21}a_{12}e_2 = e_2^2 = e_2$, then, replacing $a_{21}$ by
$e_2a_{21}$, if necessary, we may assume $e_2a_{21} = a_{21}$ and similarly $a_{12}e_2 = a_{12}$.

The element $a_{12}a_{21}$ is not 0 since

$a_{21}a_{12}a_{21}a_{12} = a_{21}a_{12} \cdot a_{21}a_{12} = e_2^2 = e_2,$

and it is idempotent since

$(a_{12}a_{21})^2 = a_{12} \cdot a_{21}a_{12} \cdot a_{21} = a_{12}e_2a_{21} = a_{12}a_{21}.$

But $a_{12}a_{21} < A_{1\alpha}A_{\alpha 1} < A_{11}$ and, since $e_1$ is primitive, it follows that
$a_{12}a_{21} = e_1$.

For the sake of symmetry we now put $a_{11} = e_1$, $a_{22} = e_2$, and we then have a
matric subalgebra of $A$, namely $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$.

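Since $A_{\alpha 1}(A_{1\alpha}A_{\alpha 1})^rA_{1\alpha} = (A_{\alpha 1}A_{1\alpha})^{r+1}$, it follows that $A_{1\alpha}A_{\alpha 1}$ and
$A_{\alpha 1}A_{1\alpha}$ are either both nilpotent or both not nilpotent. Suppose that both are
nilpotent; then, since their product in either order is 0, their sum is nilpotent and,
because $(A_{1\alpha} + A_{\alpha 1})^2 = A_{1\alpha}A_{\alpha 1} + A_{\alpha 1}A_{1\alpha}$, it follows that

$N_1 = A_{1\alpha} + A_{\alpha 1} + A_{1\alpha}A_{\alpha 1} + A_{\alpha 1}A_{1\alpha}$

is nilpotent. Now

$AN_1 = (A_{11} + A_{1\alpha} + A_{\alpha 1} + A_{\alpha\alpha})(A_{1\alpha} + A_{\alpha 1} + A_{1\alpha}A_{\alpha 1} + A_{\alpha 1}A_{1\alpha})$
$\quad = A_{11}A_{1\alpha} + A_{11}A_{1\alpha}A_{\alpha 1} + A_{1\alpha}A_{\alpha 1} + A_{1\alpha}A_{\alpha 1}A_{1\alpha} + A_{\alpha 1}A_{1\alpha}$
$\qquad + A_{\alpha 1}A_{1\alpha}A_{\alpha 1} + A_{\alpha\alpha}A_{\alpha 1} + A_{\alpha\alpha}A_{\alpha 1}A_{1\alpha}$
$\quad \le N_1$

since $A_{ij}A_{pq} = 0$ $(p \ne j)$, $A_{ij}A_{jq} \le A_{iq}$. Similarly $N_1A \le N_1$. Hence $N_1$ lies
in the radical of $A$.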



Suppose that we have found a matric subalgebra $a_{ij}$ $(i, j = 1, 2, \cdots, r - 1)$
such that $e_i = a_{ii}$ $(i = 1, 2, \cdots, r - 1)$ are primitive idempotent elements of $A$;
let $e_\alpha = 1 - \sum_{i=1}^{r-1} e_i$ and set $A_{ij} = e_iAe_j$ as before. Suppose further that
$A_{\alpha i}A_{i\alpha}$ is not nilpotent for some $i$; we may then take $i = 1$ without loss of
generality. By the argument used above there then exists a primitive idempotent
element $e_r = a_{rr} < A_{\alpha 1}A_{1\alpha}$ and elements $a_{r1} < A_{\alpha 1}$, $a_{1r} < A_{1\alpha}$ such that

$a_{r1}a_{1r} = a_{rr}, \quad a_{1r}a_{r1} = a_{11},$
$a_{rr}a_{r1} = a_{r1}, \quad a_{1r}a_{rr} = a_{1r}.$

If we set

$a_{ir} = a_{i1}a_{1r}, \quad a_{ri} = a_{r1}a_{1i} \quad (i = 1, 2, \cdots, r - 1),$

then $a_{ir} \ne 0$ since $a_{1i}a_{ir} = a_{1r}$, and $a_{ij}$ $(i, j = 1, 2, \cdots, r)$ form a matric algebra
of higher order than before.

Again, if every $A_{\alpha i}A_{i\alpha}$ is nilpotent, it follows as above that each $A_{i\alpha}A_{\alpha i}$ is also
nilpotent and hence

$N_{r-1} = \sum_{i=1}^{r-1} (A_{i\alpha} + A_{\alpha i} + A_{i\alpha}A_{\alpha i} + A_{\alpha i}A_{i\alpha}),$

having a nilpotent basis, is itself nilpotent; and it is readily seen as before that
it is invariant and therefore belongs to the radical of $A$.

We can now treat $A_{\alpha\alpha}$ in the same way as $A$, and by doing so we derive a set of
matric algebras $M_p = (a^p_{ij};\ i, j = 1, 2, \cdots, r_p)$ with the identity elements

$a_p = \sum_{i=1}^{r_p} a^p_{ii}$

such that $\sum a_p = 1$; also

$N' = \sum_{p \ne q} (a_pAa_q + a_pAa_qAa_p)$

is contained in the radical $N$ of $A$. We have therefore the following Lemma.

Lemma 2. If $A$ is an algebra with an identity, there exists a set of matric
subalgebras $M_p = (a^p_{ij};\ i, j = 1, 2, \cdots, r_p)$ with the principal units

$a_p = \sum_{i=1}^{r_p} a^p_{ii} \quad (p = 1, 2, \cdots, k)$

such that $a_pa_q = 0$ $(p \ne q)$ and $\sum a_p = 1$, and such that

$N' = \sum_{p \ne q} (a_pAa_q + a_pAa_qAa_p)$

lies in the radical $N$ of $A$. Further each $a^p_{ii}$ is a primitive idempotent element of $A$.

Corollary. $B_k = a_kAa_k + N'$ is an invariant subalgebra of $A$. For

$AB_k = a_kAa_k + \sum_{p \ne q} (a_pAa_q + a_pAa_qAa_p) = a_kAa_k + N' = B_k.$

10.09 We shall now consider the properties of the algebras $a_pAa_p$ where $a_p$
$(p = 1, 2, \cdots, k)$ are the idempotent elements defined in Lemma 2.

Lemma 3. $a_pAa_p$ is the direct product of $M_p$ and an algebra $B_p$ in which the
principal unit is the only idempotent element.

The first part of this lemma is merely a particular case of Theorem 2. That
$B_p$ contains only one idempotent element is seen as follows. If $e$ is a primitive
idempotent element of $B_p$, then $a^p_{11}e$ and $a^p_{11}(a_p - e)$ are distinct and, if not zero,
are idempotent and lie in $a^p_{11}Aa^p_{11}$; but this algebra contains only one idempotent
element since $a^p_{11}$ is primitive; hence $a^p_{11}(a_p - e) = 0$, and therefore $e = a_p$ is
the only idempotent element in $B_p$.

Lemma 4. If $B$ is an algebra whose principal unit 1 is its only idempotent element,
any element of $B$ which is singular⁷ is nilpotent; and the totality of such elements
forms the radical of $B$.

The proof of the first statement is immediate; for, if $a$ is singular, the algebra
$\{a\}$ generated by $a$ does not contain the principal unit and, since $B$ contains no
other idempotent element, $a$ is nilpotent by Theorem 4. To prove the second
part, let $x$ and $y$ be nilpotent but $z = x + y$ non-singular; then $1 = z^{-1}x +
z^{-1}y = x_1 + y_1$. Here $x_1$ and $y_1$ are singular and therefore nilpotent. If $m$ is the
index of $x_1$, then

$(1 - x_1)(1 + x_1 + x_1^2 + \cdots + x_1^{m-1}) = 1$

and this is impossible since $y_1 = 1 - x_1$ is nilpotent. Hence $z$ is also nilpotent
and the totality of nilpotent elements forms an algebra; and this algebra is
invariant since the product of any element of $B$ into a nilpotent element is
singular and therefore nilpotent. It follows that $B$ is a division algebra whenever
it has no radical, that is, when it is semi-simple.
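
The identity used in the proof, that $1 - x_1$ has the inverse $1 + x_1 + \cdots + x_1^{m-1}$ when $x_1$ is nilpotent of index $m$, may be checked numerically. The sketch below assumes a particular 3 x 3 integer matrix of index 3 as the nilpotent element; the matrix and the helper names are chosen here for illustration only.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

one = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
zero = [[0] * 3 for _ in range(3)]

# A strictly triangular matrix: x**3 = 0 but x**2 != 0, so its index is 3.
x = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
x2 = matmul(x, x)
x3 = matmul(x2, x)

series = add(add(one, x), x2)            # 1 + x + x^2
product = matmul(sub(one, x), series)    # (1 - x)(1 + x + x^2)

assert x2 != zero and x3 == zero
assert product == one                    # so 1 - x is non-singular
```

Since $1 - x$ has an inverse it cannot itself be nilpotent, which is the contradiction on which the proof rests.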

10.10 The classification of algebras. We shall now prove the main theorem 
regarding the classification of algebras in a given field F. 

Theorem 5. (i) Any algebra which contains an identity can be expressed in the 
form 

(14) $A = S + N$

where $N$ is the radical of $A$ and $S$ is a semi-simple subalgebra; $S$ is not necessarily
unique but any two determinations of it are simply isomorphic.

(ii) A semi-simple algebra can be expressed uniquely as the direct sum of simple
algebras.

(iii) A simple algebra can be expressed as the direct product of a division algebra
$D$ and a simple matric algebra $M$; these are not necessarily unique but, if $D_1$, $M_1$,
$D_2$, $M_2$ are any two determinations of $D$ and $M$, then $D_1 \simeq D_2$, $M_1 \simeq M_2$.

7 An element of $B$ is singular in $B$ if it does not have an inverse relatively to the principal
idempotent element of $B$.

We have seen in Lemma 2 that $A = \sum a_pAa_p + N'$, where $N' < N$, and also in
Lemmas 3, 4 that $a_pAa_p = M_p \times B_p$, where $M_p$ is a simple matric algebra. The
first part of the theorem therefore follows for $A$ when it is proved for any algebra
like $B_p$ and when it is shown that the direct product of $M_p$ by a division algebra
is simple; for, if $B_p = D_p + N_p$, then $D_p$ is a division algebra and

$a_pAa_p = M_p \times D_p + M_p \times N_p, \quad M_pN_p < N.$

If the field $F$ is one in which every equation has a root, the field itself is
clearly the only division algebra and hence $M_pD_p = M_p$; in this case part (i) is
already proved. Further, the theorem is trivial for algebras of order 1; we may,
therefore, as a basis for a proof by induction assume it is true for algebras of
order less than the order $\alpha$ of $A$.

If the field $F$ is extended to $F(\xi)$ by the adjunction of an algebraic irrationality
$\xi$ of degree $p + 1$, we get in place of $A$ an algebra $A' = A(\xi)$ which has the same
basis as $A$ but which contains elements whose coordinates lie in $F(\xi)$ but not
necessarily in $F$; all elements of $A$ are also elements of $A'$. Regarding $A'$ we
have the following important lemma.


Lemma 5. If $N$ is the radical of $A$, the radical of $A' = A(\xi)$ is $N' = N(\xi)$.

Let $A = C + N$, $C \frown N = 0$, and let the radical of $A'$ be $N''$; then clearly
$N'' \ge N'$. If $N'' > N'$, there is an element of $N''$ of the form

$c'' = c_0 + c_1\xi + \cdots + c_p\xi^p, \quad (c_i < C,\ c_0 \ne 0).$

Since $c''$ is nilpotent,

$0 = \mathrm{tr}(c'') = \mathrm{tr}(c_0) + \xi\,\mathrm{tr}(c_1) + \cdots$

and since $\mathrm{tr}(c_0)$, $\mathrm{tr}(c_1)$, $\cdots$ are rational in $F$, each is separately 0. But, if $a_1$, $a_2$
are arbitrary elements in $A$,

$a_1c''a_2 = a_1c_0a_2 + a_1c_1a_2\xi + \cdots$

lies in $N''$ and, since each $a_1c_ia_2$ is rational in $F$, the trace of each is 0 as above.
Hence the trace of every element in $Ac_0A$ is 0 from which it follows by Lemma 1
that $Ac_0A$ is nilpotent and being invariant and also rational it must lie in $N$ (cf.
§10.06). But $Ac_0A$ contains $c_0$ since $A$ contains 1 whereas $C \frown N = 0$; hence no
elements of $N''$ such as $c''$ exist and the lemma is therefore true.

We may also note that, if $B$, $C$ are complexes for which $B \frown C = 0$, and $B'$, $C'$
the corresponding complexes in $A'$, then also $B' \frown C' = 0$.

Suppose now that the identity is the only idempotent element of $A$ and that
the first part of the theorem is true for algebras of order less than $\alpha$. Let $a$
be an element of $A$ corresponding to an element $\bar{a}$ of $(A - N)$ and let $f(\lambda)$ be
the reduced characteristic function of $\bar{a}$; $f(\lambda)$ is irreducible in $F$ since $(A - N)$
is a division algebra. Since $f(\bar{a}) = 0$, it follows that $f(a) < N$ and hence, if $r$
is the index of $f(a)$, the reduced characteristic function of $a$ is $[f(\lambda)]^r$. If we
adjoin to $F$ a root $\xi$ of $f(\lambda)$, this polynomial becomes reducible so that in $A' =
A(\xi)$ the difference algebra $(A' - N')$ is no longer a division algebra though by
Lemma 5 it is still semi-simple. If we now carry out in $F(\xi)$ the reduction given
in Lemma 2, say

$A' = \sum e_pA'e_p + N^*,$

either the algebras $e_pA'e_p$ are all of lower order than $\alpha$, or, if $A' = e_1A'e_1$, then it
contains a matric algebra $M'$ of order $n^2$ $(n > 1)$ and, if we set $A' = M'B'$, as
previously, $B'$ is of lower order than $\alpha$. In all cases, therefore, part (i) of the
theorem follows for algebras in $F(\xi)$ of order $\alpha$ when it is true for algebras of
order less than $\alpha$, and its truth in that case is assumed under the hypothesis of
the induction.

We may now assume

$A = C + N, \quad C \frown N = 0,$

$A' = S' + N', \quad S' \frown N' = 0,$

where $S'$ is an algebra simply isomorphic with $(A' - N')$; $N'$ has a rational
basis, namely that of $N$ (cf. Lemma 5).

If $c_1, c_2, \cdots$ is a basis of $C$ then, since $A$ is contained in $A'$ we have

$c_i = s'_i + n'_i, \quad s'_i < S', \quad n'_i < N', \quad (i = 1, 2, \cdots)$

and, since $C \frown N = 0$ implies $C \frown N' = 0$, it follows that $s'_1, s'_2, \cdots$ form a
basis of $S'$, that is, we may choose a basis for $S'$ in which the elements have the
form

$c_i + n_{i0} + n_{i1}\xi + \cdots \quad (c_i < C,\ n_{ij} < N)$

where $c_i, n_{i0}, \cdots$ are rational in $F$. Moreover, since $C$ is only determined
modulo $N$, we may suppose it modified so that $n_{i0}$ is absorbed in $c_i$; we then
have a basis for $S'$

(15) $s'_i = c_i + n_{i1}\xi + \cdots + n_{ip}\xi^p = c_i + n'_i.$

When the basis is so chosen, the law of multiplication in $S'$, say

(16) $s'_is'_j = \sum_k \sigma_{ijk}s'_k$

has constants $\sigma_{ijk}$ which are rational in $F$; for $s'_i \equiv c_i$ mod $N'$ and $c_i$ is rational.
If we now replace $s'_k$ in (16) by its value from (15) and expand, we have

$c_ic_j + c_in'_j + n'_ic_j + n'_in'_j = \sum_k \sigma_{ijk}c_k + \sum_k \sigma_{ijk}n'_k,$

but $n'_in'_j < (N')^2$ and therefore

$c_ic_j + c_in'_j + n'_ic_j \equiv \sum_k \sigma_{ijk}c_k + \sum_k \sigma_{ijk}n'_k \mod (N')^2,$

a relation which is only possible if the coefficients of corresponding powers of $\xi$
are also equivalent modulo $(N')^2$ and in particular

$c_ic_j \equiv \sum_k \sigma_{ijk}c_k \mod (N')^2.$

Consequently the algebra $A_1$ generated by $c_i$ $(i = 1, 2, \cdots, \sigma)$ contains no
element of $N$ which is not also in $N^2$ and hence, except in the trivial case in which
$N = 0$, the order of $A_1$ is less than $\alpha$. By hypothesis we can therefore choose $C$
rationally in such a way that $c_ic_j = \sum_k \sigma_{ijk}c_k$, that is, such that $C$ is an algebra;
part (i) of the theorem therefore follows by induction.

10.11 For the proof of part (ii) we require the following lemmas.

Lemma 6. If $A$ contains the identity 1 and if $B$ is an invariant subalgebra which
has a principal unit $e$, then

$A = B \oplus (1 - e)A(1 - e).$

Since $e$ is the principal unit of $B$, which is invariant, $eAe = B$; also $eA(1 - e)$
and $(1 - e)Ae$ are both 0 since $Ae$ and $eA$ lie in $B$ and, if $b$ is any element of $B$,
then $(1 - e)b = b - b = 0$, $b(1 - e) = b - b = 0$; hence

(17) $A = eAe + (1 - e)A(1 - e), \quad eAe \frown (1 - e)A(1 - e) = 0.$

Further $eAe \cdot (1 - e)A(1 - e) = 0 = (1 - e)A(1 - e) \cdot eAe$, so that the sum in
(17) is a direct sum.
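
A small instance of Lemma 6 may make the decomposition (17) concrete. The sketch assumes a particular algebra, that of 3 x 3 matrices formed of a 2 x 2 block and a 1 x 1 block in the diagonal, with the invariant subalgebra consisting of the 2 x 2 block; this choice and the helper names are illustrative and are not taken from the text.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

def unit(i, j, n=3):
    """Matrix with 1 in place (i, j) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

zero = [[0] * 3 for _ in range(3)]
one = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# Basis of the assumed algebra A; the first four elements span B.
basis = [unit(0, 0), unit(0, 1), unit(1, 0), unit(1, 1), unit(2, 2)]
e = add(unit(0, 0), unit(1, 1))   # principal unit of B
f = sub(one, e)                   # 1 - e

for a in basis:
    eaf = matmul(matmul(e, a), f)
    fae = matmul(matmul(f, a), e)
    eae = matmul(matmul(e, a), e)
    faf = matmul(matmul(f, a), f)
    assert eaf == zero and fae == zero    # eA(1 - e) = 0 = (1 - e)Ae
    assert add(eae, faf) == a             # a = eae + (1 - e)a(1 - e)
```

Each basis element falls wholly in $eAe$ or wholly in $(1 - e)A(1 - e)$, which is the direct sum asserted by the lemma in this case.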

Lemma 7. Every invariant subalgebra $B$ of a semi-simple algebra $A$ is
semi-simple and therefore contains a principal unit.

Suppose that $B$ has a radical $N$; then

$AN < B, \quad (AN)^2 = ANA \cdot N < BN < N$

so that $AN$ is nilpotent. But, since $A^2 = A$, we have $(ANA)^r = (AN)^rA$;
hence $ANA$ is a nilpotent invariant subalgebra of $A$ which, since $A$ contains an
identity, is not 0 unless $N$ is 0. But $A$ has no radical; hence $N = 0$ and $B$ also
has no radical.

In consequence of these lemmas a simple algebra is irreducible and a
semi-simple algebra which is not also simple can be expressed as the direct sum of
simple algebras. Let

$A = B_1 \oplus B_2 \oplus \cdots \oplus B_p = C_1 \oplus C_2 \oplus \cdots \oplus C_q$

be two expressions of $A$ as the direct sum of simple algebras and let the principal
units of $B_i$ and $C_i$ be $b_i$ and $c_i$ respectively; then $1 = \sum b_i = \sum c_i$. We then
have $C_k \le \sum b_iC_kb_j \le C_k$ and therefore

$C_k = \sum_{i,j} b_iC_kb_j = \sum_i b_iC_kb_i$

since when $i \ne j$ then $b_iC_kb_j \le B_i \frown B_j = 0$ and $b_iC_kb_i \cdot b_jC_kb_j = 0$. If $b_iC_kb_i \ne
0$, it is an invariant subalgebra of $C_k$ and, since the latter is simple, we have
$b_iC_kb_i = C_k$ for this value of $i$ and all other $b_jC_kb_j$ equal 0, and therefore $C_k =
b_iAb_i = B_i$. The second part of the theorem is therefore proved.

10.12 We shall now prove part (iii) of Theorem 5 in two stages. 

Lemma 8. If $D$ is a division algebra and $M$ the matric algebra $(e_{ij};\ i, j = 1, 2,
\cdots, m)$, and if $D \times M = DM$, then $DM$ is simple.

Let $B$ be a proper invariant subalgebra of $A = DM$. If $x$ is an element of $B$,
then there exists an element $y \ne 0$ of $A$ such that $xy = 0$, since otherwise we should
have $B \ge xA = A$, in other words, every element of $B$ is singular in $A$ and
hence $B \frown D = 0$. But

$x = \sum_{i,j} d_{ij}e_{ij}, \quad d_{ij} < D$

and $d_{ij} = \sum_p e_{pi}xe_{jp}$ and is therefore contained in $B$ as well as in $D$. Since
$B \frown D = 0$, every $d_{ij} = 0$, that is, $x = 0$ so that $B = 0$. It follows immediately
from Lemma 2 that a simple algebra always has the form $D \times M$, and also that
$D \simeq e_{11}Ae_{11}$.
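
The formula $d_{ij} = \sum_p e_{pi}xe_{jp}$ used in this proof can be checked directly in the simplest case. The sketch below assumes that $D$ is the field of rational numbers, so that $DM$ is the algebra of 3 x 3 matrices and each $d_{ij}$ appears as a scalar multiple of the identity; the function names are introduced here for illustration.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def unit(i, j, m):
    """Matric unit e_ij of order m."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(m)] for r in range(m)]

def coordinate(x, i, j):
    """Sum over p of e_pi * x * e_jp, which should equal d_ij times 1."""
    m = len(x)
    total = [[0] * m for _ in range(m)]
    for p in range(m):
        total = add(total, matmul(matmul(unit(p, i, m), x), unit(j, p, m)))
    return total

x = [[2, -1, 0], [5, 3, 7], [4, 0, -6]]
m = len(x)

for i in range(m):
    for j in range(m):
        expected = [[x[i][j] if r == c else 0 for c in range(m)] for r in range(m)]
        assert coordinate(x, i, j) == expected
```

Since every term $e_{pi}xe_{jp}$ lies in an invariant subalgebra which contains $x$, the check illustrates why each coordinate $d_{ij}$ lies in $B$ as soon as $x$ does.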

Lemma 9. In a simple algebra all primitive idempotent elements are similar.

Let $e$ and $a$ be primitive idempotent elements of a simple algebra $A$. We can
then find a matric algebra $M = (e_{ij})$ for which $e_{11} = e$ and such that $A = D \times M$,
where $D$ is a division algebra. If $ea = 0 = ae$, we can at the same time choose
$e_{22} = a$; and $e_{22} = ue_{11}u^{-1}$ where

$u = 1 - e_{11} - e_{22} + e_{12} + e_{21} = u^{-1},$

so that the lemma is true in this case, and we may therefore assume that, say,
$ea \ne 0$.

Suppose now that $eae \ne 0$. Since $A = D \times M$, we can express $a$ in the form
$\sum \alpha_{ij}e_{ij}$ $(\alpha_{ij} < D)$, where $\alpha_{11} \ne 0$ since $eae = \alpha_{11}e_{11}$. We have then

$(ea)^2 = (e_{11}a)^2 = (\alpha_{11}e_{11} + \alpha_{12}e_{12} + \cdots)^2 = \alpha_{11}ea$

and hence $b = \alpha_{11}^{-1}ea$ is idempotent. We then have

$eb = b, \quad be = e, \quad ba = b;$

also $ab = aba = abab$ and, since $a$ is primitive, either $aba = a$ or $aba = 0$; but

$eabe = eae \ne 0,$

hence $ab = a$. We then have

$b = ueu^{-1}, \quad u = 1 - b + e, \quad u^{-1} = 1 + b - e,$
$b = v^{-1}av, \quad v = 1 - b + a, \quad v^{-1} = 1 + b - a,$

and hence $a$ and $e$ are similar in this case also.

If $eae = 0$ but $aea \ne 0$, interchanging the roles of $e$ and $a$ leads to results
similar to those just obtained; we can therefore assume $eae = 0 = aea$. If

$u = 1 + e - ea + ae = 2(2 - e + ea - ae)^{-1},$

then $uau^{-1} = a - ae$; we can therefore assume $ae = 0$. If

$v = 1 + e - 2ea = 2(2 - e + 2ea)^{-1},$

then $vav^{-1} = a - ea$; we can therefore also assume $ea = 0$, which brings us back
to the first case which we considered. The lemma is therefore proved.

Part (iii) of the theorem follows immediately. For, if $e$ and $a$ are primitive
idempotent elements of $M_1$ and $M_2$ respectively, we can now find $w$ such that
$a = wew^{-1}$; but $D_1 \simeq eAe$ and $D_2 \simeq aAa = weAew^{-1}$, which is similar to $eAe$ and
therefore to $D_1$.
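
The case $eae \ne 0$ of Lemma 9 can be followed through on a numerical example. The sketch assumes the algebra of 2 x 2 rational matrices with $e = e_{11}$ and a particular idempotent $a$ of rank 1; the elements $u = 1 - b + e$ and $v = 1 - b + a$ are formed as in the proof, and the product $w = vu$ is introduced here only to exhibit one element that transforms $e$ into $a$.

```python
from fractions import Fraction as Fr

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

def scale(c, x):
    return [[c * p for p in r] for r in x]

one = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
e = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]                # e = e_11
a = [[Fr(1, 2), Fr(1, 2)], [Fr(1, 2), Fr(1, 2)]]    # primitive idempotent, eae != 0

alpha11 = a[0][0]                       # eae = alpha11 * e_11
b = scale(1 / alpha11, matmul(e, a))    # b = alpha11^{-1} e a

u, u_inv = add(sub(one, b), e), sub(add(one, b), e)   # 1 - b + e, 1 + b - e
v, v_inv = add(sub(one, b), a), sub(add(one, b), a)   # 1 - b + a, 1 + b - a

assert matmul(b, b) == b                              # b is idempotent
assert matmul(u, u_inv) == one and matmul(v, v_inv) == one
assert matmul(matmul(u, e), u_inv) == b               # b = u e u^{-1}
assert matmul(matmul(v, b), v_inv) == a               # a = v b v^{-1}

w, w_inv = matmul(v, u), matmul(u_inv, v_inv)
assert matmul(matmul(w, e), w_inv) == a               # a = w e w^{-1}
```

The two similarities combine into a single one, which is the sense in which $a$ and $e$ are similar.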

10.13 Semi-invariant subalgebras. If $B$ is a subalgebra of $A$ which is such that
$AB \le B$ $(BA \le B)$, it is called a right (left) semi-invariant subalgebra. We
shall treat only the case in which $A$ is semi-simple; it has then an identity and if
we restrict ourselves, as we shall, to the case of right semi-invariant subalgebras,
we may assume $AB = B$.

It is clear that, if $A = A_1 \oplus A_2$, then also $B = B_1 \oplus B_2$, where $A_iB_i =
B_i$, $A_iB_j = 0$ $(i \ne j)$. It is sufficient then at first to consider only simple algebras,
and in this case we have the added condition that $ABA = A$; that is, we have
simultaneously

(18) $AB = B, \quad ABA = A.$

If we call $B$ minimal when it contains no other semi-invariant subalgebra, we
have

Lemma 10. A minimal right semi-invariant subalgebra of a simple algebra $A$ has
the form $Au$, where $u$ is a primitive idempotent element of $A$. Conversely, if $u^2 = u$
is primitive, $Au$ is a minimal right semi-invariant subalgebra.

Let $AC = C$; if $c_1 \ne 0$ is any element of $C$, and $C_1 = Ac_1 \le C$, then $AC_1 =
C_1$. Suppose $C_1 < C$; then in the same way if $c_2$ is any element of $C_1$, we have
$C_2 = Ac_2 \le C_1$. If $C_2 < C_1$, we may continue this process and after a finite
number of steps we shall arrive at an algebra $B \ne 0$ such that $Ab = B$ for every
element $b$ of $B$ which is not 0. Since $A$ is simple, $AbA = A$ and $B^2 = B$, so that
$B$ contains a primitive idempotent element $u$ and $Au = B$. If $u$ is not also
primitive in $A$, let $u = u_1 + u_2$, $u_iu_j = 0$ $(i \ne j)$, $u_i^2 = u_i \ne 0$. Then $u_1u = u_1$
so that $u_1$ is in $B$; hence $u$ must be primitive in $A$, if it is so in $B$.

Since $B = Au$, every $x$ in $B$ has the form $au$ and hence $xu = x$. But also $B =
Ax$ and, from the manner in which $B$ was chosen, either $Bx = 0$ or $Bx = B$.
If $Bx = 0$, then $ux = 0$ and therefore

$x^2 = xu \cdot xu = 0.$

Also, if $x$ is nilpotent, then $x^2 = 0 = ux$; for $uAu = uBu$ is simple since, by the
proof of Lemma 4, it is a division algebra, and $ux = uxu < uBu$. If $Bx = B$,
then there is a unique $b$ such that $bx = x$ and, since $b$ is then idempotent, we
have $ux = x$, that is, $x$ lies in $uAu$. If $B = Au$, then $AB = A^2u \le B$ so that $B$
is a right semi-invariant subalgebra of $A$. If $C$ is minimal, then $B = C$ as
desired.

Conversely, let $B = Au$, $u$ primitive; then the only idempotent quantity of $B$
has been shown above to be $u$ and, if $B$ were not primitive, we should have
$B > C = Av$, $v$ primitive, which is impossible.

Suppose now that $B$ is not minimal and let $e_1, e_2, \cdots, e_r$ be a complete set of
primitive supplementary idempotent elements in $B$. Then $B_r = Ae_1 + Ae_2 +
\cdots + Ae_r$ is semi-invariant in $A$. Let $b$ be an element of $B$ which is not in
$B_r$; since $\sum be_i < B_r$, we may replace $b$ by $b - \sum be_i$ and so assume every $be_i = 0$ in
which case clearly $Ab \frown B_r = 0$. But, if $b \ne 0$, then $Ab$ contains an idempotent
element $e$ such that $ee_i = 0$ $(i = 1, 2, \cdots, r)$ and $e' = e - \sum e_ie$ is an idempotent
element supplementary to the given complete set, which is impossible. We
therefore have the following theorem.


Theorem 6. If $A$ is simple and $AB = B$ is a semi-invariant subalgebra, then

$B = Ae_1 + Ae_2 + \cdots + Ae_r$

where $e_1, \cdots, e_r$ is a complete supplementary set of primitive idempotent elements
of $B$; and these idempotent elements are also primitive in $A$.

We shall assume that $A$ is semi-simple, say

(19) $A = S_1 \oplus \cdots \oplus S_t,$

where each $S_i$ is simple and

(20) $S_i = D_i \times M_i.$

As previously (cf. Lemma 2) we may set $M_i = (e^i_{pq})$, $p, q = 1, 2, \cdots, n_i$, where
$e^i_{pp}$ form a set of supplementary primitive idempotent elements and $\sum_{i,p} e^i_{pp} = 1$.
If $B$ is any invariant subalgebra, then $B = \sum_{i,p} Be^i_{pp}$ and $Be^i_{pp}$ is a right
semi-invariant subalgebra; if $B$ is minimal, we have already seen that it has the form
$Bu$ where $u$ is a primitive idempotent element, and therefore we have $B =
Be^i_{pp} = S_ie^i_{pp}$ for some $i$ and $p$. If we set $B_{ip} = S_ie^i_{pp}$, then

$B_{ip}e^i_{pq} = S_ie^i_{pp}e^i_{pq} = M_iD_ie^i_{pq} = S_ie^i_{qq} = B_{iq}.$


We have therefore the following theorem. 


Theorem 7. If $A$ is semi-simple and is given by (19), and if $e^i_{pp}$ form a complete
set of supplementary primitive idempotent elements such that $\sum_{p=1}^{n_i} e^i_{pp} = u_i$ is the
identity of $S_i$, then every minimal right semi-invariant subalgebra has the form

(21) $B_{ip} = S_ie^i_{pp}.$

Moreover, there is a number $e^i_{pq}$ in $S_i$ such that

(22) $B_{ip}e^i_{pq} = B_{iq}.$


10.14 The representation of a semi-simple algebra. Let $A$ be a linear
associative algebra over $F$ with the identity 1, and designate elements of $A$ by $a$.
A representation of $A$ is a set, $U(A)$, of matrices of order $n$ such that $a \to U(a)$
is a correspondence between the elements of $A$ and the matrices of the set in
which the following conditions are satisfied

(23) $U(1) = 1_n, \quad U(a + b) = U(a) + U(b), \quad U(ab) = U(a)U(b), \quad U(\alpha a) = \alpha U(a)$

for every $a$ and $b$ of $A$ and every scalar $\alpha$ in $F$.

We can now, as in chapter I, associate with the matrices $U(a)$ a vector space $R$
with a given fundamental basis, and a change of basis corresponds to replacing
$U(a)$ by $PU(a)P^{-1}$, an equivalent representation (cf. 1.08). A subspace $R_1$ of $R$
is invariant under $A$ (cf. 5.16) if every matrix $U(a)$ carries each vector of $R_1$
into a vector of $R_1$. If $R_1 \ne 0$, we may set $R = R_1 + R_2$ $(R_1 \frown R_2 = 0)$; and
since we are only interested in the equivalence of representations, we may
suppose the basis of $R$ so chosen that

(24) $U(a) = \begin{Vmatrix} U_{11}(a) & U_{12}(a) \\ 0 & U_{22}(a) \end{Vmatrix}.$

The representation is said to be reducible in this case, and it is evident that both
$U_{11}(a)$ and $U_{22}(a)$ give representations of $A$.

If $R$ has no proper invariant subspace, then $U(A)$ and $R$ are said to be
irreducible. It is now clear that we may write

$R = R_1 + R_2 + \cdots + R_t$

where $R^{(i)} = R_1 + \cdots + R_i$ is the invariant subspace of least order which
contains $R^{(i-1)}$, $(R^{(0)} = 0)$, and in this case

(25) $U(a) = \begin{Vmatrix} U_{11}(a) & U_{12}(a) & \cdots & U_{1t}(a) \\ 0 & U_{22}(a) & \cdots & U_{2t}(a) \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & U_{tt}(a) \end{Vmatrix}$

and the representations $U_{11}(a), \cdots, U_{tt}(a)$ are irreducible. If in addition $R_1,
\cdots, R_t$ are themselves invariant for some $t$, then $U_{ij}(a) = 0$ $(i \ne j;\ i, j = 1, 2,
\cdots, t)$, and we say that $U(a)$ is decomposable.

A particular case of fundamental importance arises when we take $R$ to be $A$
itself, that is, if $x$ is a variable element of $A$, then $x' = ax$ corresponds to a linear
transformation in the basis of $R$ (or $A$), say

$x' = ax = U(a)(x),$

and $U(a)$ has the property given in (23) and so is a representation of $A$. It is
obviously the representation of (6) and is one-to-one; it is called the regular
representation.
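
The construction of the regular representation from the constants of multiplication can be sketched in a few lines. The example assumes an algebra of order 2 with basis $e_0 = 1$, $e_1$ and $e_1^2 = -1$ (the complex numbers regarded as an algebra over the rational field); the array `gamma` and the functions `multiply` and `regular` are names chosen here and are not notation of the text.

```python
# gamma[i][j][k] is the coefficient of e_k in the product e_i e_j.
gamma = [
    [[1, 0], [0, 1]],    # e0 e0 = e0,  e0 e1 = e1
    [[0, 1], [-1, 0]],   # e1 e0 = e1,  e1 e1 = -e0
]
n = len(gamma)

def multiply(a, b):
    """Coordinates of the product ab from the coordinates of a and b."""
    return [sum(a[i] * b[j] * gamma[i][j][k] for i in range(n) for j in range(n))
            for k in range(n)]

def regular(a):
    """Matrix U(a) of the linear transformation x -> ax."""
    return [[sum(a[i] * gamma[i][j][k] for i in range(n)) for j in range(n)]
            for k in range(n)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a, b = [1, 2], [3, -1]

assert regular([1, 0]) == [[1, 0], [0, 1]]                      # U(1) = 1_n
assert regular(multiply(a, b)) == matmul(regular(a), regular(b))  # U(ab) = U(a)U(b)
```

The correspondence is one-to-one here because the first column of `regular(a)` is the coordinate vector of `a` itself, which is the product of `a` by the identity.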

The invariant subspaces of $A$ are evidently its right semi-invariant
subalgebras $B$. If $e_1, e_2, \cdots, e_t$ is a basis of $B$ and

(26) $ae_i = \sum_j \alpha_{ji}e_j,$

then the matrices $U(a) = \|\alpha_{ij}\|$ give a representation of $A$ on the subspace $B$.
Suppose now that $V(a)$ is a given representation, $R$ the corresponding space,
and $B$ a right semi-invariant subalgebra of $A$. If $y$ is any vector of $R$, then the
set of vectors of the form $V(b)(y)$ is an invariant subspace of $R$, since

(27) $V(a)V(b) = V(ab) = V(b_a), \quad b_a < B.$

From (27) it is seen immediately that the set $B'$ of elements $b'$ in $B$ for which
$V(b')y = 0$ forms a right semi-invariant subalgebra of $B$ and hence, if $B$ is
minimal, either $B' = 0$ or $B' = B$. If $B' = 0$, then $V(e_1)y, \cdots, V(e_t)y$ is a
basis of the set $(V(b)y)$ and

$V(a)V(e_i)y = V(ae_i)y = \sum_j \alpha_{ji}V(e_j)y.$

But then the vectors of the form $V(b)y$ give a representation of $A$ equivalent to
that determined by $B$ in (26).

We shall now prove the following theorem. 

Theorem 8. If the regular representation of an algebra is decomposable , then 
every representation is decomposable and its irreducible components are contained 
in the regular representation. 

Suppose that the regular representation of $A$ is decomposable; then $A =
B_1 + B_2 + \cdots + B_s$, where the $B_j$ are irreducible invariant subspaces of $A$,
that is, minimal semi-invariant subalgebras such that $B_j \frown B_k = 0$ for $j \ne k$.
Let $y_1, y_2, \cdots, y_n$ be a basis of the space $R$ of a representation of $A$. Since $A$
has an identity, we have

(28) $R = AR = B_1R + B_2R + \cdots + B_sR$
$\qquad = B_1y_1 + B_2y_1 + \cdots + B_sy_1 + \cdots + B_sy_n.$

As we have seen above, if $B_ky_j \ne 0$, it is a subspace of $R$ which gives a
representation equivalent to that given by $B_k$; it follows that either $B_ky_j = 0$ or it is an
invariant subspace of $R$.

The intersection of the invariant subspaces is also invariant so that either
$B_ky_j \frown B_py_q = 0$ or $B_ky_j = B_py_q$; hence we may select from the spaces $B_ky_q$ in
(28) a set of independent irreducible invariant subspaces determining $R$. This
proves Theorem 8.

Consider now a semi-simple algebra

$A = S_1 \oplus \cdots \oplus S_r$

where $S_i$ is a simple algebra. We may write

$1 = \sum u_{ij} \quad (i = 1, \cdots, r;\ j = 1, \cdots, n_i)$

where the $u_{ij}$ form a complete set of supplementary primitive idempotent
elements of $A$. Then

$A = \sum Au_{ij} = \sum B_{ij}$

where $B_{ij} = Au_{ij}$ is a minimal right invariant subalgebra of $A$. We have then
decomposed $A$ into irreducible invariant subspaces and have proved the first
part of the following theorem.

Theorem 9. The regular representation of a semi-simple algebra is decomposable,
and its irreducible components are those obtained by the use of the $B_{ij}$ as
representation spaces. The representations given by any pair $B_{ij}$, $B_{ik}$ are equivalent
while $B_{ij}$, $B_{tk}$ give inequivalent representations for $i \ne t$.

For by Theorem 7 we have $B_{ij} e^{i}_{jk} = B_{ik}$, so that the proof of Theorem 7 with
$y = e^{i}_{jk}$ shows that the representation by $B_{ij}$ is equivalent to that by $B_{ik}$. In
the representation by $B_{ij}$ we have

$e_i = \sum_{j=1}^{n_i} u_{ij} \to 1_{n_i}$

where $1_{n_i}$ is an identity matrix corresponding to the identity transformation on
$B_{ij}$, since $e_i$ is the principal unit of $S_i$. But in the representation by $B_{pk}$ with $p \neq i$, we
have $e_i \to 0$. Evidently these representations cannot be similar.

10.15 Group algebras. If $\mathfrak{G} = (g_1 = 1, g_2, \ldots, g_m)$ is a finite group, the
group relation $g_i g_j = g_{ij}$ is a particular case of the associative product defined in
(2) and, when it is used in conjunction with addition, we get an associative
algebra G of which $(g_1, g_2, \ldots, g_m)$ is a basis and $g_1$ the identity.
The representation of $\mathfrak{G}$ as a regular permutation group

$g_i = \begin{pmatrix} g_1 & g_2 & \cdots & g_m \\ g_{i1} & g_{i2} & \cdots & g_{im} \end{pmatrix}$

corresponds to the representation of G as a set of matrices, the matrix $h_i$ being
the permutation matrix corresponding to $g_i$.
Since $ip = p$, that is, $g_i g_p = g_p$, only when $g_i$ is the identity, the matrix $h_i$ has no
nonzero coordinate in the main diagonal except for $i = 1$, in which case $h_1$ is the identity
matrix; hence

(29) $\operatorname{tr}(h_1) = m, \qquad \operatorname{tr}(h_i) = 0 \quad (i \neq 1).$

It follows from this that G is semi-simple. For if $u = \sum_i \eta_i h_i$ is the matrix
corresponding to some element of the radical N, then $\operatorname{tr}(u) = 0$ since u is nilpotent.
If $u \neq 0$, some coordinate, say $\eta_p$, is not 0, and in $h_p^{-1}u$, which also corresponds to
some element of N, the coefficient of $h_1$ is not 0; we may therefore assume $\eta_1 \neq 0$
provided $N \neq 0$. But using (29) we get

$0 = \operatorname{tr}(u) = \sum_i \eta_i \operatorname{tr}(h_i) = m\eta_1;$

hence the assumption that $u \neq 0$ leads to a contradiction and therefore $N = 0$,
that is, G is semi-simple. This gives the following theorem.

Theorem 10. A group algebra is semi-simple. It is therefore the direct sum of
simple algebras and, if the field of the coefficients is sufficiently extended, it is the
direct sum of simple matric algebras.

The whole of the representation theory developed in the previous section can 
now be applied to groups. 
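
The trace relations (29) can be checked numerically on a small case. The following is only an illustrative sketch, not part of the original lectures: it assumes the symmetric group on three letters as the finite group, and the helper names (`compose`, `regular_matrices`, `trace`, `matmul`) are hypothetical. Each group element $g_i$ is sent to the permutation matrix $h_i$ of left multiplication on the basis $g_1, \ldots, g_m$.

```python
from itertools import permutations


def compose(a, b):
    """Product ab of two permutations stored as tuples: (ab)(x) = a[b[x]]."""
    return tuple(a[x] for x in b)


def regular_matrices(elements):
    """Return the matrix h_i of each g_i in the regular representation."""
    index = {g: k for k, g in enumerate(elements)}
    m = len(elements)
    mats = []
    for g in elements:
        h = [[0] * m for _ in range(m)]
        for q, x in enumerate(elements):
            # g * g_q = g_p places a 1 in row p, column q
            h[index[compose(g, x)]][q] = 1
        mats.append(h)
    return mats


def trace(h):
    """Sum of the coordinates on the main diagonal."""
    return sum(h[k][k] for k in range(len(h)))


def matmul(a, b):
    """Ordinary product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[p][k] * b[k][q] for k in range(n)) for q in range(n)]
            for p in range(n)]


# Assumed example group: all permutations of {0, 1, 2}; sorting puts the
# identity (0, 1, 2) first, so it plays the part of g_1.
elements = sorted(permutations(range(3)))
mats = regular_matrices(elements)
traces = [trace(h) for h in mats]
print(traces)  # [6, 0, 0, 0, 0, 0]
```

Only the identity has a nonzero trace, equal to the order $m = 6$ of the group, in agreement with (29). The helper `matmul` can be used in the same way to confirm that the product of two of these matrices is the matrix of the product of the corresponding group elements.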


APPENDIX I 
NOTES 1 
Chapter I 

This calculus of matrices was first used in 1853 by Hamilton (1, p. 559ff, 480ff) under the
name of “Linear and vector functions.” Cayley used the term matrix in 1854, but merely
for a scheme of coefficients, and not in connection with a calculus. In 1858 (2) he developed 
the basic notions of the algebra of matrices without recognizing the relation of his work to 
that of Hamilton; in some cases (e.g., the theory of the characteristic equation) Cayley 
gave merely a verification, whereas Hamilton had already used methods in three and four 
dimensions which extend immediately to any number of dimensions. The algebra of 
matrices was rediscovered by Laguerre (9) in 1867, and by Frobenius (18) in 1878. 

1.03 Matric units seem to have been first used by B. Peirce (17); see also Grassmann (5, 
§381). 

1.10 For the history of the notion of rank and nullity see Muir, Theory of Determinants,
London 1900-1930; the most important paper is by Frobenius (290). 

Chapter II 

2.01-03 The principle of substitution given in §2.01 was understood by most of the early 
writers, but was first clearly stated by Frobenius, who was also the first to use the division
transformation freely (20, p. 203). 

2.04 The remainder theorem is implicit in Hamilton’s proof of the characteristic equation; 
see also Frobenius (280). 

2.05-12 The characteristic equation was proved by general methods for n = 3, 4 by
Hamilton (1, p. 507; 8, p. 484ff; cf. also 4, 6). The first general statement was given by Cayley
(2); the first general proof by Frobenius (18). See also the work of Frobenius cited below 
and 9, 10, 39, 41, 56, 59. 

Hamilton, Cayley and other writers were aware that a matrix might satisfy an equation 
of lower degree than n, but the theory of the reduced equation seems to be due entirely to 
Frobenius (18, 140). 

The theory of invariant vectors was foreshadowed by Hamilton, but the general case was 
first handled by Grassmann (5). 

2.10 See Sylvester (42, 44) and Taber (96); see also 252.

2.13 The square root of a matrix was considered by Cayley (3, 12), Frobenius (139) and 
many others. 

Chapter III 

3.01 The idea of an elementary transformation seems to be due in the main to Grassmann (6).


1 In these Notes, numbers refer to the Bibliography unless otherwise indicated.
3.02-07 The theory of pairs of bilinear forms, which is equivalent to that of linear
polynomials, was first given in satisfactory form by Weierstrass (see Muth, 175) although the
importance of some of the invariants had been previously recognized by Sylvester. The 
theory in its matrix form is principally due to Frobenius (18, 20). 

The theory of matrices with integral elements was first investigated by Smith (see Muth, 
175) but was first given in satisfactory form by Frobenius (20). The form given in the text 
is essentially that of Kronecker (92). 

3.04 The proof of Theorem 3 is a slight modification of that of Frobenius (20). 

3.08 Invariant vectors were discussed by Hamilton (1, 8) and other writers on quaternions
and vector analysis. The earliest satisfactory account seems to be that of Grassmann (5).

Chapter IV 

The developments of this chapter are, in the main, a translation of Kronecker’s work
(see Muth, 175, p. 93ff). See also de Seguier (259). 

Chapter V 

5.03 From the point of view of matrix theory, the principal references are Schur (198),
Rados (105, 106), Stephanos (185), and Hurwitz (117). See Loewy (284, p. 138) for
additional references; also Muir, Theory of Determinants, London 1900-1930.

5.09 Non-commutative determinants were first considered by Cayley (Phil. Mag. 20 
(1845), 141-145); see also Joly (195) and Sylvester (43). 

5.10-11 See Loewy (284, p. 149); also 176, 178, 185, 198.

5.12 The principal references are Schur (198) and Weyl (440, chap. 5). 

Chapter VI 

For general references see Loewy (284, pp. 118-137), also Muth (175), Hilton (314, chap.
0, 8) and Muir, Theory of Determinants, London 1900-1930.

6.01 The method of proving that the roots are real is essentially that of Tait (10, chap. 5);
see also 36, 60, 228, 399.


6.03 See Loewy (284, pp. 130-137), Baker (215) and Frobenius (202). See also 7, 18, 99,
113, 114, 115, 124, 135, 139, 210, 221, 273, 302, 307, 320, 371, 400, 414, 466, 476.

6.04 See Dickson (392).

6.05 See Loewy (284, pp. 128-135).

6.07 For references see Muth (175, p. 125) and Frobenius (139). 


Chapter VII


7.01-02 See Cayley (2), Frobenius (18), Buchheim (59), Taber (98, 112), and Hilton (314,
chap. 5); also 83, 86, 98, 137, 184, 197, 209, 223, 242, 250, 264, 301, 382.


7.03 See Frobenius (280). 

7.05 See Frobenius (140); also 350. 


7.06-07 See Sylvester (42, 44) and Taber (96); see also 252. 
Chapter VIII

8.01-03 See Sylvester (36), Buchheim (59, 69); also 134, 371.

8.02, 07 See Hamilton (1, p. 545ff; 8, §316), Grassmann (5, §454), Laguerre (9). Many
writers define the exponential and trigonometric functions and consider the question of
convergence, e.g., 79, 80, 103, 389, 449; also in connection with differential equations, 13, 133,
258.


8.04-05 Roots of 0 and 1 have been considered by a large number of writers; see
particularly the suite of papers by Sylvester in 1882-84; also 18, 67, 76, 107, 242, 255, 264, 277, 279,
381, 411, 430, 474, 539.

8.08 See 20, 94, 246, 256, 257, 274, 303, 338, 399. 

8.09-11 The absolute value of a matrix was first considered by Peano (75) in a somewhat
different form from that given here; see also 273, 348, 389, 472, 473, 494. For infinite
products see 133, 324, 326, 389, 494.

8.12 In addition to the references already given above, see 10, 16, 18, 187, 418, 419, and also 
many writers on differential equations. 

Chapter IX 

The problem of the automorphic transformation in matrices was first considered by
Cayley (3, 7) who, following a method used by Hermite, gave the solution for symmetric and
skew matrices; his solution was put in simpler form by Frobenius (18). Cayley failed to
impose necessary conditions in the general case which was first solved by Voss (85, 108, 162,
163). The properties of the principal elements were given by Taber (125, 134; see also 127,
149, 156, 158, 231). Other references will be found in Loewy (284, pp. 130-137); see also 9, 19,
153, 154, 161, 167, 168, 169, 187, 229, 371.